Abstract

In this paper we consider symmetric games where a large number of players can be in
any one of $d$ states. We derive a limiting mean field model and characterize its main
properties. This mean field limit is a system of coupled ordinary differential equations with
initial-terminal data. For this mean field problem we prove a trend to equilibrium theorem,
that is, convergence, in an appropriate limit, to stationary solutions. Then we study the
$N+1$-player problem, which the mean field model attempts to approximate. Our main result is the
convergence, as $N \to \infty$, to the mean field model and an estimate of the rate of convergence.
We end the paper with some further examples for potential mean field games.
1 Introduction

Mean field games is a recent area of research started by Peter Caines and his co-workers [HMC06],
[HCM07], and independently by Pierre Louis Lions and Jean Michel Lasry [LL06a, LL06b, LL07a,
LL07b], which attempts to understand the limiting behavior of systems involving very large
numbers of rational agents which play dynamic games under partial information and symmetry
assumptions. Inspired by ideas in statistical physics, in this class of models the individual players'
contributions are encoded into a mean field that contains all relevant statistical properties about
the ensemble.



The literature on mean field games and its applications is growing fast. For recent surveys see
[LLG10b] or [Car10], and references therein. Mean field games arise in the study of growth theory in
economics [LLG10a, ML11], production of exhaustible resources [LLG10b], or environmental policy
[ALT], for instance, and it is likely that in the future they will play an important role in economics
and population models. There is also a growing interest in numerical methods for these problems
[ALT], [AD10]. A related concept, called oblivious equilibrium, corresponds to the case where
players are assumed to make decisions based only on their own state and knowledge of the long-run
average industry state; oblivious equilibrium and stationary equilibrium models were introduced and studied in detail
in, respectively, [WBVR08] and [AJW11]. Mean field models correspond to the limit of $N$ player
games under symmetry assumptions. The Markov perfect equilibrium notion for these games has
been studied (mostly in discrete time or stationary setting) in [FS09, Liv02, MT01, Str93], and
references therein. In [PM01], [Sle01] symmetric Markov perfect equilibria are also considered,
and in the last paper the case with an infinite number of players is studied. In [Kap95] the passage
from discrete time to continuous time is considered for $N$ players in a war of attrition problem.
The techniques in the present paper are, however, substantially different from the above references.

In this paper we begin by presenting a mean field model for a continuous time dynamic game
between a large number of rational agents, which we call players. These are allowed to switch
between a finite number of states, seeking to optimize certain functionals, which depend
on the statistical distribution of the other players. We discuss the concept of Nash equilibrium,
which allows us to derive a system of ordinary differential equations for the distribution of the
players, as well as their value function. Next, we consider the $N+1$-player game, which corresponds
to the previous problem before taking the mean field limit. In the $N+1$-player game each player
knows only the state but not the identity of the remaining players. We are particularly interested
in understanding the limit of the $N+1$-player game as the number of players increases to infinity.

In discrete time, mean field models with a finite number of states were studied in [GMS10]. In his PhD
thesis, [Gue09], O. Gueant considered a problem with two states, modeling the labor market. In
this work he considered a continuum of individuals and a labor market consisting of 2 sectors. Each
individual has to decide in which sector he or she is going to work. This model consists of a coupled
system of ordinary differential equations of the type that will be derived in §2. More recently, in
[Gue11b] and [Gue11a], several discrete state problems have also been studied in detail, namely
their connection with systems of conservation laws. Further models with discrete state space were
also considered in [TBEA09], [HTAEA11]. In these models, each individual in a large population
interacts with randomly selected players. This interaction determines the instantaneous payoff
for all involved players. In particular, these authors establish several very interesting limit results.
We should note, however, that these last works do not study mean field games in the sense of this
paper, namely they lack the forward-backward structure of the equilibrium as in the works by
Caines, Lions-Lasry, among others.

We start in §2 by describing the mean field game. We derive a mean field model for the optimal
switching policy of a reference player given the fraction $\theta(t) \in [0,1]^d$ of players in each of $d$ states.
Then we introduce the concept of Nash equilibrium. This equilibrium turns out to be determined
by a coupled system of ordinary differential equations, where one equation governs the evolution
of $\theta$, and is subjected to initial conditions, whereas the other equation models the evolution of a
value function and has terminal data. We call this problem the initial-terminal value problem.
These models are similar to the ones in [Gue11b], [Gue11a]. Initial-terminal value problems are in
fact a general feature in many mean field game problems, see for instance [LL06a], [LL06b], [LL07a],
though not very common in ODE problems. In fact, existence and uniqueness of solutions is not
immediate from the general ODE theory but, adapting the methods of Lions and Lasry, we were
successful in establishing both. We also study a class of contractive mean field games for which
a-priori bounds on suitable norms can be established. In particular, under this condition one can
prove existence of stationary solutions. The main result of this section is a trend to equilibrium
theorem, in the spirit of the results in [GMS10]. The proof relies on a reverse Gronwall inequality
(i.e. when an integral of a function is controlled by the function at the endpoints).

In §3 we consider the Nash equilibrium problem before taking a mean field limit, i.e. with a
finite number, $N+1$, of players. As before we suppose that all players are identical and so the
game is symmetric with respect to permutation of the players. We adopt the point of view of a
reference player, which could be chosen as any one of the players. We assume that this player
(as any other) has access to the same information, namely, his/her own state at time $t$, given by
$i_t \in \{1, 2, 3, \ldots, d\}$, and the number $n_t \in \mathbb{N}^d$ of remaining players that are in each of the states. The
objective of the reference player is to minimize, by controlling the process $i_t$, and given the process
$n_t$, the expected value of the integral of a running cost function added to a terminal cost. We
assume that both $i_t$ and $n_t$ are controlled non-time-homogeneous coupled Markov chains. More
precisely, we suppose that $N$ of the players have a fixed Markov switching strategy $\beta$, known by
the reference player, who then chooses a switching strategy $\alpha(\beta)$. This is a well known Markov
decision problem. The Nash equilibrium corresponds to $\alpha(\beta) = \beta$. In this setting the equilibrium
is characterized by a system of ordinary differential equations with a terminal condition. This system, as explained in
§5, can be seen as a discretized version of a partial differential equation (introduced by Lions in
his course at Collège de France and further studied by Gueant [Gue11b], [Gue11a]) for the value
function that can be derived for the mean field model, as an alternative to the initial-terminal value
problem formulation. In addition to this characterization we prove various bounds, uniform in
$N$, which then allow us to address the passage to the limit problem in §4.

In §4 we prove the main result of the paper, namely the convergence in $L^2$, as the
number of players $N \to \infty$, of the $N+1$-player model to the mean field model of §2. In a
different setting, convergence to the MFG model was established by [KLY11] using very interesting
techniques from non-linear Markov chains. We should note that the techniques in that paper
do not apply to the problem we consider, as our problem has a different structure. Our
convergence result gives, for small $T$, a rate of convergence of the order $1/\sqrt{N}$. For the proof we
do not need monotonicity assumptions. In particular, this implies uniqueness of solution to the mean
field problem for small time. Our proof uses a double Gronwall-type inequality where part of the
integrand can be estimated forward in time, whereas the other part can only be estimated backwards
in time.

In §5 we end this paper with an important class of examples, namely potential mean field
games. These have been studied in detail by Pierre Louis Lions (Collège de France course) and
also in [Gue11b], [Gue11a]. For these mean field games several connections with Hamiltonian and
Lagrangian dynamics can be derived which have interesting applications to planning problems. We
also discuss a variational formulation in analogy to the results in [GSM11] and [GPSM11], as well
as some connections with partial differential equations, numerical methods and Hamilton-Jacobi
equations.

2 A mean field model 

In this section we derive a mean field model which, as we will show later, corresponds to the limit 
as the number of players tends to infinity of symmetric dynamic games with a finite number of 
players. 

We consider a continuous time dynamic game where a large number of players can be in any of $d$
states. The players can switch from state to state and their decisions depend on certain optimality
criteria which we will describe in the following. We suppose that all players are identical and so the
game is symmetric with respect to permutation of the players. Each player knows only its own state
and the fraction of players in each of the $d$ states. Each player can control the transition rate
from one state to another and incurs both a running cost and a terminal cost, which depend
on its own state, on the state of the other players (through their distribution among states and not
on individual players' states) as well as on the controls the player chooses.

We will fix one of the players, which will be called the reference player. Because the game is
symmetric, the identity of this player is not important, and all other players have access to similar
information. We further assume the mean field hypothesis, that is, since the number of players
is very large, the only information available to the reference player is the distribution of players
given by a probability vector $\theta \in S^d$, where $S^d$ is the probability simplex

\[
\theta^1 + \dots + \theta^d = 1, \qquad \theta^i \ge 0 \quad \forall i,\ 1 \le i \le d.
\]

Under the mean field hypothesis, the evolution of the vector $\theta$ can be approximated by an ordinary
differential equation as discussed in §2.1.



2.1 Continuous time Markov process and the Kolmogorov equation 

We suppose that the players' distribution among states is given by a probability vector $\theta(t) \in S^d$.
Let $\beta(t) \in \mathbb{R}^{d\times d}$ represent a transition rate matrix depending on the time $t$, where $\beta_{ij}(t) \ge 0$ if
$i \neq j$, and $\beta_{ii} = -\sum_{j \neq i} \beta_{ij}$. We assume that the players switch from state to state according to
a continuous time (inhomogeneous) Markov process with transition rate matrix $\beta$, which for now
we suppose is known. In the mean field limit, the fraction of players in each state, $\theta$, satisfies the
Kolmogorov equation

\[
\frac{d\theta^i}{dt} = \sum_j \theta^j \beta_{ji}. \tag{1}
\]

The previous equation is complemented by an initial condition $\theta(0) = \theta_0 \in S^d$, from which the
evolution of the distribution of players $\theta : [0,T] \to S^d$ is completely determined. For convenience,
controls are also identified with a vector $\beta(i) \in \mathbb{R}^d$ with the convention that $\beta_j(i) = \beta_{ij}$, where
$\beta_j(i)$ denotes the $j$-th coordinate of $\beta(i)$.
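
As an illustration of the Kolmogorov equation of this subsection, the following is a minimal sketch, not part of the original analysis, that integrates $\frac{d\theta^i}{dt} = \sum_j \theta^j \beta_{ji}$ with an explicit Euler scheme. The constant rate matrix, the step size and the helper names `make_rate_matrix` and `kolmogorov_step` are assumptions chosen only for illustration. Because every row of the rate matrix sums to zero, the total mass $\sum_i \theta^i$ is preserved by each step.

```python
def make_rate_matrix(off_diagonal):
    """Fill the diagonal so that beta_ii = -sum_{j != i} beta_ij (rows sum to zero)."""
    d = len(off_diagonal)
    beta = [row[:] for row in off_diagonal]
    for i in range(d):
        beta[i][i] = -sum(beta[i][j] for j in range(d) if j != i)
    return beta


def kolmogorov_step(theta, beta, dt):
    """One explicit Euler step of d(theta^i)/dt = sum_j theta^j * beta[j][i]."""
    d = len(theta)
    return [theta[i] + dt * sum(theta[j] * beta[j][i] for j in range(d))
            for i in range(d)]


# Hypothetical constant rates for d = 3 states (illustration only).
beta = make_rate_matrix([[0.0, 1.0, 0.5],
                         [0.2, 0.0, 0.3],
                         [0.4, 0.1, 0.0]])
theta = [1.0, 0.0, 0.0]      # all players start in state 1
for _ in range(1000):        # integrate up to time 1 with dt = 1e-3
    theta = kolmogorov_step(theta, beta, 1e-3)
```

The step size is taken small enough that $dt\,|\beta_{ii}| \le 1$, which keeps the Euler iterates inside the simplex.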

2.2 Running and terminal costs 

We fix now a reference player and consider the optimization problem according to his/her point 
of view. We assume that the state of this player is driven by a continuous time discrete state 
optimal control problem in which he/she controls the switching rates from state to state. These 
switching rates are chosen in order to minimize a certain cost which is the sum of a running cost 
and a terminal cost. The running cost depends on the player's state, the switching rate, and the 
fraction of players in each state. The terminal cost depends on the player's terminal state as well 
as the terminal distribution of players among states. 

Let $I_d = \{1, 2, 3, \ldots, d\}$. The running cost of the reference player whose state is $i$ is given by
a cost $c : I_d \times S^d \times (\mathbb{R}_0^+)^d \to \mathbb{R}$, $c(i,\theta,\alpha)$, where $\theta \in S^d$ is the probability distribution of players
among states, and $\alpha_j$ is the transition rate the reference player uses to change from state $i$ to
state $j$. We suppose $c$ is Lipschitz continuous in $\theta$, with the Lipschitz constant (with respect to $\theta$)
bounded independently of $\alpha$. We suppose $c$ is differentiable with respect to $\alpha$, and that $\frac{\partial c}{\partial \alpha}(i,\theta,\alpha)$
is Lipschitz with respect to $\theta$, uniformly in $\alpha$.

We also suppose that $c(i,\theta,\alpha)$ does not depend on $\alpha_i$ and is uniformly convex (on the remaining
coordinates), that is, there exists $\gamma > 0$ such that for any $i \in I_d$, $\theta \in S^d$, $\alpha, \alpha' \in (\mathbb{R}_0^+)^d$, with $\alpha_j \neq \alpha'_j$ for some $j \neq i$,

\[
c(i,\theta,\alpha') - c(i,\theta,\alpha) \ge \nabla_\alpha c(i,\theta,\alpha) \cdot (\alpha' - \alpha) + \gamma \|\alpha' - \alpha\|^2. \tag{2}
\]

We suppose that $c$ is superlinear on $\alpha_j$, $j \neq i$, that is,

\[
\lim_{\alpha_j \to \infty} \frac{c(i,\theta,\alpha)}{\|\alpha\|} = \infty.
\]

The reference player has a terminal cost denoted by $\psi : I_d \times S^d \to \mathbb{R}$, $\psi^i(\theta)$. We suppose $\psi$ is
Lipschitz continuous in $\theta$.

2.3 Single player control problem: the value function 

Let $T > 0$ be the time duration of the game. Suppose the players are distributed among the $d$ states
according to the probability distribution $\theta : [0,T] \to S^d$, which for now we assume to be known
by the reference player. Let

\[
u_\theta^i(t,\alpha) = E^\alpha_{i_t = i}\left[ \int_t^T c(i_s, \theta(s), \alpha(s))\, ds + \psi^{i_T}(\theta(T)) \right].
\]

We define the value function associated to $\theta$, denoted by $u_\theta : I_d \times [0,T] \to \mathbb{R}$, as

\[
u_\theta^i(t) = \min_\alpha u_\theta^i(t,\alpha), \tag{3}
\]

where $i_s$ is a continuous time Markov chain controlled by $\alpha$ which corresponds to the state of the
reference player at time $s$, and $E^\alpha_{i_t = i}$ is the expectation conditioned on the event $i_t = i$, given the
transition rate $\alpha$. Here the minimization is performed over Markovian controls $\alpha(s) = \alpha(i_s, s)$.
More precisely,

\[
\mathbb{P}\left[ i_{s+h} = j \mid i_s \right] = \alpha_j(s)\, h + o(h), \qquad j \neq i_s,
\]

where $\lim_{h \to 0} \frac{o(h)}{h} = 0$. In §2.5 existence of optimal Markovian controls will be proved.



2.4 Definitions and preliminary results 

Let $\Delta_i : \mathbb{R}^d \to \mathbb{R}^d$ be the difference operator on $i$, given by

\[
\Delta_i z = (z^1 - z^i, \ldots, z^d - z^i).
\]

The infinitesimal generator of a finite state continuous time Markov chain, with transition rates
$\nu_{ij}$, acting on a function $\varphi : I_d \to \mathbb{R}$, is given by

\[
A_i^\nu(\varphi) = \sum_j \nu_{ij}(\varphi^j - \varphi^i) = \nu_{i\cdot} \cdot \Delta_i \varphi.
\]

We define the generalized Legendre transform of the function $c(i,\theta,\cdot)$ as

\[
h(z,\theta,i) = \min_{\mu \in (\mathbb{R}_0^+)^d} \Big[ c(i,\theta,\mu) + \sum_j \mu_j (z^j - z^i) \Big]
= \min_{\mu \in (\mathbb{R}_0^+)^d} \big[ c(i,\theta,\mu) + \mu \cdot \Delta_i z \big]. \tag{4}
\]

Because of the superlinearity and uniform convexity of $c$ the function

\[
\alpha^*(z,\theta,i) = \operatorname{argmin}_{\mu \in (\mathbb{R}_0^+)^d} \big[ c(i,\theta,\mu) + \mu \cdot \Delta_i z \big] \tag{5}
\]

is well defined, except for its $i$-th coordinate, since $(\Delta_i z)^i = 0$. We will denote the $j$-th entry of
the vector $\alpha^*(z,\theta,i)$ as $\alpha_j^*(z,\theta,i)$, and for convenience and definiteness we set

\[
\alpha_i^*(z,\theta,i) = -\sum_{j \neq i} \alpha_j^*(z,\theta,i). \tag{6}
\]

The definition (6) is consistent because $c(i,\theta,\alpha)$ does not depend on the $i$-th entry of the vector
$\alpha$, and for that reason (5) does not define $\alpha_i^*(z,\theta,i)$. The uniform convexity of $c(i,\theta,\cdot)$ shows that
$\alpha^*$ is well defined. We will write $h(\Delta_i z,\theta,i)$ and $\alpha^*(\Delta_i z,\theta,i)$ to stress the fact that $h$ and $\alpha^*$
depend only on $\Delta_i z$. Because

\[
h(\Delta_i z,\theta,i) = h(z,\theta,i)
\]

there is no ambiguity in this notation.
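
To make the generalized Legendre transform concrete, the following sketch works out one hypothetical instance: the quadratic running cost $c(i,\theta,\alpha) = \sum_{j \neq i} \alpha_j^2/2 + f^i(\theta)$. For this cost, minimizing coordinate by coordinate over $\mu_j \ge 0$ gives $\alpha_j^* = (z^i - z^j)^+$ and $h = f^i(\theta) - \frac12 \sum_{j\neq i} [(z^i - z^j)^+]^2$. The function names and the sample vector `z` are assumptions made for illustration only.

```python
def delta(z, i):
    """Difference operator: (Delta_i z)^j = z^j - z^i."""
    return [zj - z[i] for zj in z]


def alpha_star(z, i):
    """Minimiser of sum_{j != i} mu_j^2 / 2 + mu . Delta_i z over mu_j >= 0.

    The i-th entry is set to minus the sum of the others, as in convention (6).
    """
    d = len(z)
    a = [max(z[i] - z[j], 0.0) if j != i else 0.0 for j in range(d)]
    a[i] = -sum(a)
    return a


def h_quadratic(z, f_i, i):
    """Value of the minimisation for the quadratic cost; f_i stands for f^i(theta)."""
    return f_i - 0.5 * sum(max(z[i] - z[j], 0.0) ** 2
                           for j in range(len(z)) if j != i)


z = [1.0, 0.25, 2.0]              # sample values, illustration only
rates = alpha_star(z, 0)          # switching rates out of state 1
value = h_quadratic(z, 0.0, 0)    # h(z, theta, 1) with f^1(theta) = 0
```

Only the neighbour with a lower value ($z^2 < z^1$) attracts a positive switching rate, and replacing `z` by `delta(z, 0)` leaves `value` unchanged, in line with $h(\Delta_i z,\theta,i) = h(z,\theta,i)$.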

The following Proposition is proved in the Appendix. 

Proposition 1. We have 

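a) If $h$ is differentiable, for $j \neq i$

\[
\alpha_j^*(\Delta_i z,\theta,i) = \frac{\partial h(\Delta_i z,\theta,i)}{\partial z^j};
\]

furthermore, in general, for all $z$ and $v$

\[
h(z+v,\theta,i) - h(z,\theta,i) \le \sum_j \alpha_j^*(z,\theta,i)\, v^j, \tag{7}
\]

i.e. $\alpha^*(z,\theta,i) \in \partial_z^+ h(z,\theta,i)$, where $\partial_z^+$ denotes the superdifferential.

b) The function $\alpha^*$ is Lipschitz in $p$ and in $\theta$. The Lipschitz constants are uniform. More
precisely,

\[
\|\alpha^*(p',\theta,i) - \alpha^*(p,\theta,i)\| \le \frac{1}{\gamma}\|p' - p\| \qquad \forall p, p', \theta, i,
\]

and

\[
\|\alpha^*(p,\theta,i) - \alpha^*(p,\theta',i)\| \le \frac{K_c}{\gamma}\|\theta - \theta'\| \qquad \forall p, \theta, \theta', i,
\]

where $\gamma$ is the constant given by (2) and $K_c$ is the Lipschitz constant of $\nabla_\alpha c$.

c) The function $h$ is locally Lipschitz in $p$ and in $\theta$. The Lipschitz constants are uniform if $\Delta_i z$
is bounded.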

2.5 Hamilton-Jacobi equation and a Verification Theorem 

We continue to assume that $\theta : [0,T] \to S^d$ is given. As in classical optimal control we introduce
now the Hamilton-Jacobi ODE:

\[
\begin{cases}
-\dfrac{du^i}{dt} = h(\Delta_i u,\theta,i), \\[4pt]
u^i(T) = \psi^i(\theta(T)).
\end{cases} \tag{8}
\]

This is a terminal value problem (TVP) consisting of a system of $d$ coupled ODEs with a terminal
condition given by $\psi$. It turns out, as Theorem 1 states, that the solution to this ODE is the value
function. Before proving Theorem 1 we begin by proving a maximum principle for the equation
(8), which will also be used to prove existence and uniqueness.

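Proposition 2. If $u$ is a solution to the HJ equation (8), and $M = \max_{(i,\theta) \in I_d \times S^d} |h(0,\theta,i)|$, then
for all $0 \le t \le T$ we have

\[
\|u(t)\| \le \|u(T)\| + 2M(T-t),
\]

where $\|u(t)\| = \max\{|u^1(t)|, \ldots, |u^d(t)|\}$.

Proof. Let $u$ be a solution to (8). Let $\tilde u = u + \rho(T-t)$. Since $\Delta_i \tilde u = \Delta_i u$,

\[
-\frac{d\tilde u^i}{dt} = h(\Delta_i \tilde u,\theta,i) + \rho.
\]

Let $(i,t)$ be a minimum point of $\tilde u$ on $I_d \times [0,T]$. We have $\tilde u^j(t) - \tilde u^i(t) \ge 0$, hence $\Delta_i \tilde u =
(\tilde u^1(t) - \tilde u^i(t), \ldots, \tilde u^d(t) - \tilde u^i(t)) \ge 0$. Therefore

\[
-\frac{d\tilde u^i}{dt}(t) = h(\Delta_i \tilde u,\theta,i) + \rho \ge h(0,\theta,i) + \rho,
\]

because if $\Delta_i p \ge 0$ we have

\[
h(\Delta_i p,\theta,i) \ge h(0,\theta,i),
\]

since $\alpha^* \ge 0$. Furthermore, if we take $M < \rho < 2M$ we get

\[
-\frac{d\tilde u^i}{dt}(t) > 0.
\]

This shows that the minimum of $\tilde u$ is achieved at $T$, hence

\[
u^i(t) \ge -\|u(T)\| - 2M(T-t).
\]

Similarly, let $(i,t)$ be a maximum point of $\tilde u$ on $I_d \times [0,T]$. In this case we have $\Delta_i \tilde u \le 0$.
Hence

\[
-\frac{d\tilde u^i}{dt}(t) = h(\Delta_i \tilde u,\theta,i) + \rho \le h(0,\theta,i) + \rho.
\]

Furthermore, if we take $-2M < \rho < -M$ we get

\[
-\frac{d\tilde u^i}{dt}(t) < 0.
\]

This shows that the maximum of $\tilde u$ is achieved at $T$, hence

\[
u^i(t) \le \|u(T)\| + 2M(T-t).
\]

□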



As a consequence of the last Proposition (and also using that $h$ is locally Lipschitz), the Picard theorem
allows us to state

Proposition 3. The terminal value problem (TVP) given by (8) has a unique solution.
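
The terminal value problem and the a-priori bound of Proposition 2 can be explored numerically. The sketch below is an illustration under stated assumptions, not the authors' method: it freezes a hypothetical distribution path, uses the model choice $h(\Delta_i u,\theta,i) = \theta^i - \frac12\sum_{j\neq i}[(u^i-u^j)^+]^2$ (so that $M = \max|h(0,\theta,i)| = 1$ on the simplex), and integrates backwards from $T$ with an explicit Euler scheme. All names and numerical values are placeholders.

```python
def h(u, theta, i):
    # Hypothetical model: f^i(theta) = theta[i] with quadratic switching cost.
    return theta[i] - 0.5 * sum(max(u[i] - u[j], 0.0) ** 2
                                for j in range(len(u)) if j != i)


def solve_hj(theta_path, psi_T, T):
    """Explicit Euler, backwards in time, for -du^i/dt = h(Delta_i u, theta, i).

    theta_path holds the distribution at the n + 1 grid times k * T / n;
    psi_T is the terminal value u(T). Returns the list of u at all grid times.
    """
    n = len(theta_path) - 1
    dt = T / n
    u = list(psi_T)
    path = [None] * (n + 1)
    path[n] = u
    for k in range(n, 0, -1):
        u = [u[i] + dt * h(u, theta_path[k], i) for i in range(len(u))]
        path[k - 1] = u
    return path


T, n = 2.0, 2000
theta_path = [[0.7, 0.2, 0.1]] * (n + 1)       # frozen distribution (assumption)
path = solve_hj(theta_path, [0.0, 1.0, -0.5], T)
```

For this example $\|u(T)\| = 1$, so the computed values can be compared directly with the bound $\|u(t)\| \le 1 + 2(T-t)$.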

Now we prove a verification Theorem: 

Theorem 1. Suppose $u : I_d \times [0,T] \to \mathbb{R}$ is a solution to the Hamilton-Jacobi terminal value
problem (8). Then $u$ is the value function associated to the distribution $\theta$, and

\[
\tilde\alpha(i,s) = \alpha^*(\Delta_i u(s), \theta(s), i)
\]

is an optimal Markovian control.



Proof. The main tool for proving Theorem 1 is the Dynkin formula (see [Kol11], for instance):
suppose $\alpha$ is a Markovian control continuous in time. Define the infinitesimal generator of the
process $i_s$ by

\[
(A^\alpha \varphi)^i(s) = \sum_j \alpha_{ij}(s)\,[\varphi^j(s) - \varphi^i(s)]. \tag{9}
\]

We have that, for any function $\varphi : I_d \times [0,+\infty) \to \mathbb{R}$, $C^1$ in the last variable, and any $t < T$,

\[
E^\alpha_{i_t = i}\left[ \varphi^{i_T}(T) - \varphi^i(t) \right]
= E^\alpha_{i_t = i}\left[ \int_t^T \frac{d\varphi^{i_s}}{dt}(s) + (A^\alpha \varphi)^{i_s}(s)\, ds \right], \tag{10}
\]

where the superscript $\alpha$ means that $i_s$ is driven by the control $\alpha$, while the subscript $i_t = i$
means we are considering the expectation conditioned on $i_t = i$. We call (10) the Dynkin formula,
in analogy to the Dynkin formula in stochastic calculus.

Now to prove the Theorem we take $\varphi = u$ in (10). Using the terminal condition $u^i(T) =
\psi^i(\theta(T))$ we have that, for any control $\alpha$,

\[
E^\alpha_{i_t = i}\left[ \psi^{i_T}(\theta(T)) - u^i(t) \right]
= E^\alpha_{i_t = i}\left[ \int_t^T \frac{du^{i_s}}{dt}(s) + (A^\alpha u)^{i_s}(s)\, ds \right]. \tag{11}
\]

Now let $\alpha$ be any control. In the next steps we will use the definition of $u_\theta^i(t,\alpha)$ given in (3), and
then (11), (9), and (4) to obtain

\[
\begin{aligned}
u_\theta^i(t,\alpha)
&= E^\alpha_{i_t = i}\left[ \psi^{i_T}(\theta(T)) + \int_t^T c(i_s,\theta(s),\alpha(s))\, ds \right] \\
&= u^i(t) + E^\alpha_{i_t = i}\left[ \int_t^T \frac{du^{i_s}}{dt}(s) + (A^\alpha u)^{i_s}(s) + c(i_s,\theta(s),\alpha(s))\, ds \right] \\
&\ge u^i(t) + E^\alpha_{i_t = i}\left[ \int_t^T \frac{du^{i_s}}{dt}(s) + \min_{\mu \in (\mathbb{R}_0^+)^d} \Big\{ \sum_j \mu_j \big[u^j(s) - u^{i_s}(s)\big] + c(i_s,\theta(s),\mu) \Big\}\, ds \right] \\
&= u^i(t) + E^\alpha_{i_t = i}\left[ \int_t^T \frac{du^{i_s}}{dt}(s) + h(\Delta_{i_s} u(s),\theta(s),i_s)\, ds \right] \\
&= u^i(t),
\end{aligned}
\]

where the last equality holds because $u$ is a solution to the Hamilton-Jacobi equation (8). Note
that in the particular case where $\alpha$ is given by the specific control $\tilde\alpha(i,s) = \alpha^*(\Delta_i u(s),\theta(s),i)$, we
have equality in all the steps above, and therefore $u_\theta^i(t,\tilde\alpha) = u^i(t)$, which shows that
$\tilde\alpha$ is an optimal control and that the value function $u_\theta^i(t)$ is indeed given by $u^i(t)$.

□



2.6 Mean field Nash equilibria 

The mean field Nash equilibrium occurs when the background players are using a strategy $\beta$ for
which the best response of the reference player is $\beta$ itself, more precisely when the transition rate
from $j$ to $i$ at time $s$ is given by

\[
\beta_{ji}(s) = \alpha_i^*(\Delta_j u(s),\theta(s),j).
\]

The Nash equilibrium is then characterized by the system of Kolmogorov and Hamilton-Jacobi
equations

\[
\begin{cases}
\dfrac{d}{dt}\theta^i = \sum_j \theta^j \alpha_i^*(\Delta_j u,\theta,j), \\[4pt]
-\dfrac{d}{dt}u^i = h(\Delta_i u,\theta,i),
\end{cases} \tag{12}
\]

together with the initial-terminal conditions

\[
\theta(0) = \theta_0, \qquad u^i(T) = \psi^i(\theta(T)). \tag{13}
\]

Note that from the ODE point of view this problem is somewhat non-standard, as some of the
variables have initial conditions whereas other variables have prescribed terminal data. We call
this problem the initial-terminal value problem (ITVP) for the mean field game, and a solution
of such an ITVP is what we call a solution to the MFG given by $T, \theta_0, c, \psi$.
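
The forward-backward structure of the initial-terminal value problem can be illustrated by a simple numerical sketch. The code below is only a hedged illustration under strong assumptions: a hypothetical model with $h(\Delta_i u,\theta,i) = \theta^i - \frac12\sum_{j\ne i}[(u^i-u^j)^+]^2$, optimal rates $\alpha_j^* = (u^i - u^j)^+$, terminal cost $\psi = 0$, explicit Euler steps, and a damped fixed-point iteration that alternates a backward Hamilton-Jacobi sweep with a forward Kolmogorov sweep. Convergence of this iteration is not claimed in general; the function `solve_mfg` and its parameters are invented for this example.

```python
def h(u, theta, i):
    # Hypothetical model: f^i(theta) = theta[i] with quadratic switching cost.
    return theta[i] - 0.5 * sum(max(u[i] - u[j], 0.0) ** 2
                                for j in range(len(u)) if j != i)


def alpha_star(u, i, j):
    # Optimal rate from state i to state j != i for the quadratic cost.
    return max(u[i] - u[j], 0.0)


def solve_mfg(theta0, T, n=200, iterations=50, damping=0.5):
    d = len(theta0)
    dt = T / n
    theta = [list(theta0) for _ in range(n + 1)]    # initial guess: constant path
    for _ in range(iterations):
        # Backward sweep: -du/dt = h, with terminal data u(T) = psi = 0.
        u = [[0.0] * d for _ in range(n + 1)]
        for k in range(n, 0, -1):
            u[k - 1] = [u[k][i] + dt * h(u[k], theta[k], i) for i in range(d)]
        # Forward sweep: d(theta^i)/dt = inflow - outflow with the optimal rates.
        fresh = [list(theta0)]
        for k in range(n):
            cur = fresh[-1]
            nxt = []
            for i in range(d):
                inflow = sum(cur[j] * alpha_star(u[k], j, i)
                             for j in range(d) if j != i)
                outflow = cur[i] * sum(alpha_star(u[k], i, j)
                                       for j in range(d) if j != i)
                nxt.append(cur[i] + dt * (inflow - outflow))
            fresh.append(nxt)
        # Damped update of the distribution path.
        theta = [[(1 - damping) * a + damping * b for a, b in zip(old, new)]
                 for old, new in zip(theta, fresh)]
    return theta, u


theta, u = solve_mfg([0.9, 0.1], T=0.5)
```

In this crowd-averse example the players initially concentrated in state 1 drift towards state 2, while the total mass stays equal to one along the whole path; the returned `u` is the value function computed from the last-but-one distribution iterate.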



2.7 Existence of Nash Equilibria in the MFG 

We now address the existence of solutions to (12) satisfying the initial-terminal conditions (13).
The proof of existence will be based upon a fixed point argument.

Proposition 4. There exists a solution to (12) satisfying the initial-terminal conditions (13).

Proof. Let $\mathcal{F}$ be the set of continuous functions defined on $[0,T]$ and taking values in $S^d$, with the
$C^0$ norm. Consider the function $\xi : \mathcal{F} \to \mathcal{F}$ that is obtained in the following way: given $\theta \in \mathcal{F}$,
let $u_\theta$ be the solution of the terminal value problem given by the Hamilton-Jacobi equations (8) with terminal condition

\[
u_\theta^i(T) = \psi^i(\theta(T)). \tag{14}
\]

We know $u_\theta$ depends continuously on the parameter $\theta$.

Now take the optimal control $\beta^\theta$ given by the Verification Theorem (Theorem 1):

\[
\beta^\theta(i,t) = \operatorname{argmin}_{\mu \in (\mathbb{R}_0^+)^d} \big[ c(i,\theta,\mu) + \mu \cdot \Delta_i u_\theta \big] = \alpha^*(\Delta_i u_\theta,\theta,i). \tag{15}
\]

We use Proposition 1 to conclude that $\beta^\theta$ is a continuous function of $u_\theta$, and therefore of $\theta$.
Finally, let $\xi = \xi(\theta)$ be the solution to the Kolmogorov equation (1) given by the initial value
problem:

\[
\frac{d}{dt}\xi^i = \sum_j \xi^j \beta^\theta_{ji}, \qquad \xi(0) = \theta_0. \tag{16}
\]

Such a solution $\xi(\theta)$ depends continuously on $\beta^\theta$, and therefore on $\theta$.

Therefore, using standard ODE theory, we have just proved that $\xi$ is a continuous function from $\mathcal{F}$
to $\mathcal{F}$.

Now, using Proposition 2 we see from (15) that $\beta^\theta$ is bounded, with bounds that do not depend
on $\theta$, and therefore from (16) we have that $\xi(\theta)$ is Lipschitz, with Lipschitz constant $\Lambda$ independent
of $\theta$.

Now consider the set $\mathcal{C}$ of all Lipschitz continuous functions in $\mathcal{F}$ with Lipschitz constant
bounded by $\Lambda$. This is a set of uniformly bounded and equicontinuous functions. Thus, by
Arzelà-Ascoli, it is a relatively compact set. It is also clear that it is a convex set. Hence, by Brouwer's
fixed point theorem (in the form due to Schauder for compact convex subsets of a Banach space), $\xi$ has a fixed point in $\mathcal{C}$. □

2.8 The monotonicity hypothesis 

In order to prove the uniqueness of the MFG (§2.10), and also to consider the convergence of
solutions of the MFG to stationary solutions (when $T \to \infty$, see §2.13), we need to introduce several
monotonicity hypotheses, as in the original works by Lions and Lasry. We start with a definition:

Definition 1. Let $v \in \mathbb{R}^d$, and set $\mathbf{1} = (1, \ldots, 1) \in \mathbb{R}^d$. In $\mathbb{R}^d/\mathbb{R}$ we define the norm

\[
\|v\|_\sharp = \inf_{\lambda \in \mathbb{R}} \|v + \lambda \mathbf{1}\|.
\]

Observe that

\[
\Delta_i u = \Delta_i v \ \ \forall\, 1 \le i \le d \iff \exists\, c \in \mathbb{R} \text{ such that } u = v + c\mathbf{1} \iff \|u - v\|_\sharp = 0.
\]

Furthermore we have

\[
\|u\|_\sharp = \frac{\max_i (u^i) - \min_i (u^i)}{2}.
\]
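
The closed formula for this quotient norm is easy to check numerically. The short sketch below, an illustration with hypothetical helper names, computes $(\max_i u^i - \min_i u^i)/2$ and compares it with the sup norm of $u + \lambda\mathbf{1}$; the infimum over $\lambda$ is attained at $\lambda = -(\max_i u^i + \min_i u^i)/2$, and the value does not change when a constant is added to all entries.

```python
def sharp_norm(v):
    """Quotient norm on R^d / R: half the spread between largest and smallest entry."""
    return (max(v) - min(v)) / 2.0


def shifted_sup_norm(v, lam):
    """Sup norm of v + lam * (1, ..., 1)."""
    return max(abs(x + lam) for x in v)


v = [3.0, -1.0, 0.5]                       # sample vector, illustration only
best_shift = -(max(v) + min(v)) / 2.0      # centres the entries around zero
value = sharp_norm(v)
```

Here `shifted_sup_norm(v, best_shift)` coincides with `value`, while any other shift gives a sup norm that is at least as large.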

Assumption 1. We suppose the following monotonicity hypothesis on $\psi$:

\[
\sum_i (\theta^i - \tilde\theta^i)\big(\psi^i(\theta) - \psi^i(\tilde\theta)\big) \ge 0. \tag{17}
\]

The previous assumption holds, for instance, if $\psi$ is the gradient of a convex function.

Assumption 2. We suppose that for every $M$, on the set $\|z\|_\sharp < M$ the function $\Delta_i z \mapsto h(\Delta_i z)$
is uniformly concave in the non-degenerate directions, i.e., there exists $\gamma_i > 0$ such that

\[
h(\Delta_i z,\theta,i) - h(\Delta_i w,\theta,i) - \alpha^*(\Delta_i w,\theta,i) \cdot (\Delta_i z - \Delta_i w) \le -\gamma_i \|\Delta_i z - \Delta_i w\|^2. \tag{18}
\]

Assumption 3. We also suppose that $h$ satisfies the following monotonicity property:

\[
\theta \cdot \big(h(z,\tilde\theta) - h(z,\theta)\big) + \tilde\theta \cdot \big(h(\tilde z,\theta) - h(\tilde z,\tilde\theta)\big) \le -\gamma \|\theta - \tilde\theta\|^2, \tag{19}
\]

where $h(z,\theta) := (h(\Delta_1 z,\theta,1), \ldots, h(\Delta_d z,\theta,d))$, and $\gamma > 0$.

Assumptions 2 and 3 will be satisfied if $h$ can be written as

\[
h(\Delta_i z,\theta,i) = \tilde h(\Delta_i z,i) + f^i(\theta),
\]

with $\tilde h$ (locally) uniformly concave in the sense of (18) and $f$ satisfying the monotonicity hypothesis

\[
\big(f(\tilde\theta) - f(\theta)\big) \cdot (\theta - \tilde\theta) \le -\gamma |\theta - \tilde\theta|^2.
\]

The previous property holds, for instance, if $f$ is the gradient of a convex function, $f(\theta) = \nabla\Phi(\theta)$.

2.9 A key estimate 

The monotonicity hypotheses from the previous section can be used to establish both uniqueness
of equilibrium solutions (§2.10) and a trend to equilibrium type result (§2.13). For convenience,
rather than considering the initial-terminal value problem with initial values for $\theta$ at $t = 0$, we
consider the problem with initial values at $t = -T$. This will be convenient when studying the
trend to equilibrium, which corresponds to sending $T \to \infty$ and analyzing the behavior of $(\theta(0), u(0))$.

Lemma 1. Fix $T > 0$ and suppose that $(\theta,u)$ and $(\tilde\theta,\tilde u)$ are solutions of (12) with initial-terminal
conditions $\theta(-T) = \theta_{-T}$, $u^i(T) = \psi^i(\theta(T))$ and $\tilde\theta(-T) = \tilde\theta_{-T}$, $\tilde u^i(T) = \tilde\psi^i(\tilde\theta(T))$. Assume further
that $\|u\|_\sharp, \|\tilde u\|_\sharp \le C$. Then there exists a constant $C$ independent of $T$ such that, for all $0 < M < T$,
we have

\[
\int_{-M}^{M} \|(\theta - \tilde\theta)(s)\|^2 + \|(u - \tilde u)(s)\|_\sharp^2\, ds
\le C\Big( \|(\theta - \tilde\theta)(M)\|^2 + \|(u - \tilde u)(M)\|_\sharp^2 + \|(\theta - \tilde\theta)(-M)\|^2 + \|(u - \tilde u)(-M)\|_\sharp^2 \Big).
\]
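Proof. Observe that

\[
\begin{aligned}
\frac{d}{dt}\big[(\theta - \tilde\theta)\cdot(u - \tilde u)\big]
&= \sum_i \frac{d}{dt}(\theta^i - \tilde\theta^i)\,(u^i - \tilde u^i) + (\theta^i - \tilde\theta^i)\,\frac{d}{dt}(u^i - \tilde u^i) \\
&= \sum_i (u^i - \tilde u^i)\Big[ \sum_j \theta^j \alpha_i^*(\Delta_j u,\theta,j) - \sum_j \tilde\theta^j \alpha_i^*(\Delta_j \tilde u,\tilde\theta,j) \Big]
+ (\theta^i - \tilde\theta^i)\big(h(\Delta_i \tilde u,\tilde\theta,i) - h(\Delta_i u,\theta,i)\big).
\end{aligned}
\]

In order to use the hypotheses (18) and (19) we add and subtract some terms and we change the
names of the variables in the double sums:

\[
\begin{aligned}
\frac{d}{dt}\big[(\theta - \tilde\theta)\cdot(u - \tilde u)\big]
&= \sum_{i=1}^d \theta^i \big[h(\Delta_i \tilde u,\tilde\theta,i) - h(\Delta_i \tilde u,\theta,i)\big] + \tilde\theta^i \big[h(\Delta_i u,\theta,i) - h(\Delta_i u,\tilde\theta,i)\big] \\
&\quad + \sum_{i=1}^d \theta^i \big[h(\Delta_i \tilde u,\theta,i) - h(\Delta_i u,\theta,i)\big] + \sum_{i=1}^d \sum_{j=1}^d \theta^i \alpha_j^*(\Delta_i u,\theta,i)(u^j - \tilde u^j) \\
&\quad + \sum_{i=1}^d \tilde\theta^i \big[h(\Delta_i u,\tilde\theta,i) - h(\Delta_i \tilde u,\tilde\theta,i)\big] + \sum_{i=1}^d \sum_{j=1}^d \tilde\theta^i \alpha_j^*(\Delta_i \tilde u,\tilde\theta,i)(\tilde u^j - u^j).
\end{aligned}
\]

Now using that $(u^i - \tilde u^i) \sum_j \alpha_j^*(\Delta_i u,\theta,i) = 0$ and remembering that $\sum_j \alpha_j^*(\Delta_i u,\theta,i)(u^j - \tilde u^j) =
\alpha^*(\Delta_i u,\theta,i) \cdot (u - \tilde u)$, we have

\[
\begin{aligned}
\frac{d}{dt}\big[(\theta - \tilde\theta)\cdot(u - \tilde u)\big]
&= \sum_{i=1}^d \theta^i \big[h(\Delta_i \tilde u,\tilde\theta,i) - h(\Delta_i \tilde u,\theta,i)\big] + \tilde\theta^i \big[h(\Delta_i u,\theta,i) - h(\Delta_i u,\tilde\theta,i)\big] \\
&\quad + \sum_{i=1}^d \theta^i \big[h(\Delta_i \tilde u,\theta,i) - h(\Delta_i u,\theta,i) - \alpha^*(\Delta_i u,\theta,i)\cdot(\Delta_i \tilde u - \Delta_i u)\big] \\
&\quad + \sum_{i=1}^d \tilde\theta^i \big[h(\Delta_i u,\tilde\theta,i) - h(\Delta_i \tilde u,\tilde\theta,i) - \alpha^*(\Delta_i \tilde u,\tilde\theta,i)\cdot(\Delta_i u - \Delta_i \tilde u)\big].
\end{aligned}
\]

Now we can use (18) and (19) to get the following estimate:

\[
\frac{d}{dt}\big[(\theta - \tilde\theta)\cdot(u - \tilde u)\big] \le -\gamma\|\theta - \tilde\theta\|^2 - \sum_{i=1}^d (\theta^i + \tilde\theta^i)\,\gamma_i \|\Delta_i u - \Delta_i \tilde u\|^2. \tag{20}
\]

Integrating (20) between $-M$ and $M$, for $0 < M < T$, we obtain

\[
\big((\theta - \tilde\theta)\cdot(u - \tilde u)\big)(M) - \big((\theta - \tilde\theta)\cdot(u - \tilde u)\big)(-M)
\le \int_{-M}^{M} -\gamma\|\theta - \tilde\theta\|^2 - \sum_{i=1}^d (\theta^i + \tilde\theta^i)\,\gamma_i \|\Delta_i u - \Delta_i \tilde u\|^2\, ds.
\]

Note that $(\theta - \tilde\theta)\cdot c\mathbf{1} = 0$. Also, for each $t$ there exists $c_t \in \mathbb{R}$ such that $\|(u - \tilde u)(t) + c_t\mathbf{1}\| =
\|(u - \tilde u)(t)\|_\sharp$. Hence

\[
\begin{aligned}
\int_{-M}^{M} \gamma\|\theta - \tilde\theta\|^2 + \sum_{i=1}^d (\theta^i + \tilde\theta^i)\,\gamma_i \|\Delta_i u - \Delta_i \tilde u\|^2\, ds
&\le \big((\theta - \tilde\theta)\cdot(u - \tilde u + c_{-M}\mathbf{1})\big)(-M) - \big((\theta - \tilde\theta)\cdot(u - \tilde u + c_M\mathbf{1})\big)(M) \\
&\le \tfrac12\|(\theta - \tilde\theta)(M)\|^2 + \tfrac12\|(u - \tilde u)(M)\|_\sharp^2 + \tfrac12\|(\theta - \tilde\theta)(-M)\|^2 + \tfrac12\|(u - \tilde u)(-M)\|_\sharp^2.
\end{aligned}
\]

Using that $\|\Delta_i u - \Delta_i \tilde u\| = \|u - \tilde u - (u^i - \tilde u^i)\mathbf{1}\| \ge \inf_{\lambda \in \mathbb{R}} \|u - \tilde u + \lambda\mathbf{1}\| = \|u - \tilde u\|_\sharp$, and writing
$\tilde\gamma = \min_i \gamma_i$, we have

\[
\begin{aligned}
\int_{-M}^{M} \gamma\|(\theta - \tilde\theta)(s)\|^2 + \tilde\gamma\|(u - \tilde u)(s)\|_\sharp^2\, ds
&\le \int_{-M}^{M} \gamma\|\theta - \tilde\theta\|^2 + \sum_{i=1}^d (\theta^i + \tilde\theta^i)\,\gamma_i \|\Delta_i u - \Delta_i \tilde u\|^2\, ds \\
&\le \tfrac12\Big( \|(\theta - \tilde\theta)(M)\|^2 + \|(u - \tilde u)(M)\|_\sharp^2 + \|(\theta - \tilde\theta)(-M)\|^2 + \|(u - \tilde u)(-M)\|_\sharp^2 \Big).
\end{aligned}
\]

Therefore we have proved

\[
\int_{-M}^{M} \|(\theta - \tilde\theta)(s)\|^2 + \|(u - \tilde u)(s)\|_\sharp^2\, ds
\le C\Big( \|(\theta - \tilde\theta)(M)\|^2 + \|(u - \tilde u)(M)\|_\sharp^2 + \|(\theta - \tilde\theta)(-M)\|^2 + \|(u - \tilde u)(-M)\|_\sharp^2 \Big), \tag{21}
\]

with $C = \frac{1}{2\min\{\gamma,\tilde\gamma\}}$.

□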



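Lemma 2. Fix $T > 0$. Suppose that $(\theta,u)$ and $(\tilde\theta,\tilde u)$ are solutions of (12) with initial-terminal
conditions $\theta(-T) = \theta_0$, $u^i(T) = \psi^i(\theta(T))$ and $\tilde\theta(-T) = \tilde\theta_0$, $\tilde u^i(T) = \tilde\psi^i(\tilde\theta(T))$. Then

\[
\int_{-T}^{T} \|(\theta - \tilde\theta)(s)\|^2 + \|(u - \tilde u)(s)\|_\sharp^2\, ds \le KT^3 + 4T. \tag{22}
\]

Proof. Note that $\|(\theta - \tilde\theta)(s)\| \le 2$. Let $K = \|\psi - \tilde\psi\|_{C^0}$. For each $-T \le s \le T$, by the definition
of $u$ and $\tilde u$ we have, denoting by $\tilde\alpha$ an optimal control for $\tilde u$,

\[
\tilde u^i(s) = \min_\alpha E^\alpha_{i_s = i}\left[ \int_s^T c(i_t,\tilde\theta_t,\alpha_t)\, dt + \tilde\psi^{i_T}(\tilde\theta_T) \right]
= E^{\tilde\alpha}_{i_s = i}\left[ \int_s^T c(i_t,\tilde\theta_t,\tilde\alpha_t)\, dt + \tilde\psi^{i_T}(\tilde\theta_T) \right]
\]

and

\[
u^i(s) \le E^{\tilde\alpha}_{i_s = i}\left[ \int_s^T c(i_t,\theta_t,\tilde\alpha_t)\, dt + \psi^{i_T}(\theta_T) \right].
\]

Hence

\[
u^i(s) - \tilde u^i(s) \le E^{\tilde\alpha}_{i_s = i}\left[ \int_s^T \big(c(i_t,\theta_t,\tilde\alpha_t) - c(i_t,\tilde\theta_t,\tilde\alpha_t)\big)\, dt + \big(\psi^{i_T}(\theta_T) - \tilde\psi^{i_T}(\tilde\theta_T)\big) \right].
\]

By the Lipschitz continuity of $c$ and $\psi$ in $\theta$ (remember that the Lipschitz continuity of $c$ is uniform
in $\alpha$), we have

\[
u^i(s) - \tilde u^i(s) \le 2TK_1 + K_2,
\]

for suitable constants $K_1$, $K_2$ depending only on the Lipschitz constants of $c$ and $\psi$ and on $K$.
Changing the roles of $u$ and $\tilde u$ we get

\[
\|u(s) - \tilde u(s)\|^2 \le KT^2 + K.
\]

Thus

\[
\int_{-T}^{T} \|(\theta - \tilde\theta)(s)\|^2 + \|(u - \tilde u)(s)\|_\sharp^2\, ds \le KT^3 + (4 + 2K)T.
\]

□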



2.10 Uniqueness of equilibria for the initial-terminal value problem 

The first consequence of the monotonicity hypothesis is the uniqueness of equilibrium solutions 
for the initial-terminal value problem, which is a simple application of Lions-Lasry monotonicity 
method. 

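Theorem 2. Suppose the monotonicity assumptions 1, 2 and 3 hold. Then the system (12) and
(13) has a unique solution $(\theta,u)$.

Proof. Suppose $(\theta,u)$ and $(\tilde\theta,\tilde u)$ are solutions of (12) and (13). At the initial point $t = 0$ we have
that $(\theta - \tilde\theta)\cdot(u - \tilde u) = 0$, because $\theta(0) = \tilde\theta(0) = \theta_0$.
Integrating (20) between $0$ and $T$, and using the terminal conditions, we have that

\[
\big(\theta(T) - \tilde\theta(T)\big) \cdot \big(\psi(\theta(T)) - \psi(\tilde\theta(T))\big)
\le \int_0^T -\gamma\|\theta - \tilde\theta\|^2 - \sum_{i=1}^d (\theta^i + \tilde\theta^i)\,\gamma_i \|\Delta_i u - \Delta_i \tilde u\|^2\, ds.
\]

Now, by Assumption 1 we get

\[
0 \le \int_0^T -\gamma\|\theta - \tilde\theta\|^2 - \sum_{i=1}^d (\theta^i + \tilde\theta^i)\,\gamma_i \|\Delta_i u - \Delta_i \tilde u\|^2\, ds,
\]

which implies that $\theta(s) = \tilde\theta(s)$ for all $s \in [0,T]$. Therefore, we have the uniqueness for $\theta$. Then,
once $\theta$ is known to be unique, we obtain by a standard ODE argument that $u = \tilde u$. □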
2.11 Contractive mean field games

We now introduce a condition that allows us to establish existence of stationary solutions as well
as a-priori bounds for the initial-terminal value problem.

Definition 2. Let $\langle u \rangle = \frac{1}{d}\sum_j u^j$. We say that $h : \mathbb{R}^d \times S^d \times I_d \to \mathbb{R}$ is contractive if there exists
$M > 0$ such that, $\forall\theta$, $\forall i$, if $\|u\|_\sharp > M$, then

\[
(\Delta_i u)^j \le 0 \ \ \forall j \quad \text{implies} \quad h(\Delta_i u,\theta,i) - \langle h(u,\theta,\cdot) \rangle < 0, \tag{23}
\]

and

\[
(\Delta_i u)^j \ge 0 \ \ \forall j \quad \text{implies} \quad h(\Delta_i u,\theta,i) - \langle h(u,\theta,\cdot) \rangle > 0. \tag{24}
\]

Conditions (23) and (24) are natural if one observes that

\[
(\Delta_{i_1} u)^j \le 0 \ \ \forall j \quad \text{and} \quad (\Delta_{i_2} u)^j \ge 0 \ \ \forall j
\]

implies

\[
2\|u\|_\sharp = u^{i_1} - u^{i_2}.
\]

So, if $u$ is a smooth solution to (12) and $\|u(t)\|_\sharp$ is differentiable with $\|u(t)\|_\sharp > M$, then

\[
\frac{d}{dt}\|u\|_\sharp > 0,
\]

which implies the flow is backwards contractive with respect to the $\|\cdot\|_\sharp$ norm of the $u$ component.
The contractivity condition can be verified explicitly in many examples, as we will illustrate in
what follows. Consider the particular case

\[
c(i,\theta,\alpha) = \sum_{j \neq i} \frac{\alpha_j^2}{2} + f^i(\theta), \tag{25}
\]

where $f^i(\theta)$ is continuous on $\theta \in S^d$. We have in this case that

\[
h(\Delta_i u,\theta,i) = f^i(\theta) - \frac12 \sum_j \big[(u^i - u^j)^+\big]^2. \tag{26}
\]



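We will show now that $h$ is contractive. Suppose first $(\Delta_i u)^j \le 0 \ \forall j$. As all other cases are similar, we assume $i = 1$ and

\[ u^1 \ge u^2 \ge \dots \ge u^d. \tag{27} \]

Therefore

\[ h(\Delta_1 u, \theta, 1) - \langle h(u, \theta, \cdot) \rangle = -\frac{d-1}{2d}\sum_{j>1}(u^1 - u^j)^2 + \frac{1}{2d}\Big( \sum_{j>2}(u^2 - u^j)^2 + \sum_{j>3}(u^3 - u^j)^2 + \dots + (u^{d-1} - u^d)^2 \Big) + F_1(\theta), \]

where $F_1(\theta)$ is a bounded function of $\theta$, namely

\[ F_1(\theta) = \frac{d-1}{d} f^1(\theta) - \frac{1}{d}\sum_{j>1} f^j(\theta), \qquad |F_1(\theta)| \le 2\max_{\theta, i} |f^i(\theta)|. \tag{28} \]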



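Now we multiply by $-2d$ to have

\[ -2d\big(h(\Delta_1 u, \theta, 1) - \langle h(u, \theta, \cdot) \rangle\big) = (d-1)\sum_{j>1}(u^1 - u^j)^2 - \Big( \sum_{j>2}(u^2 - u^j)^2 + \sum_{j>3}(u^3 - u^j)^2 + \dots + (u^{d-1} - u^d)^2 \Big) - 2d F_1(\theta). \]

Reordering we have

\[
\begin{aligned}
-2d\big(h(\Delta_1 u, \theta, 1) - \langle h(u, \theta, \cdot) \rangle\big)
&= \sum_{j>1}(u^1 - u^j)^2 + \Big( \sum_{j>1}(u^1 - u^j)^2 - \sum_{j>2}(u^2 - u^j)^2 \Big) \\
&\quad + \Big( \sum_{j>1}(u^1 - u^j)^2 - \sum_{j>3}(u^3 - u^j)^2 \Big) + \dots + \Big( \sum_{j>1}(u^1 - u^j)^2 - (u^{d-1} - u^d)^2 \Big) - 2d F_1(\theta).
\end{aligned}
\]

Now using (27) we have the inequality

\[
\begin{aligned}
-2d\big(h(\Delta_1 u, \theta, 1) - \langle h(u, \theta, \cdot) \rangle\big)
&\ge \sum_{j>1}(u^1 - u^j)^2 + \Big( \sum_{j>1}(u^1 - u^j)^2 - \sum_{j>2}(u^1 - u^j)^2 \Big) \\
&\quad + \Big( \sum_{j>1}(u^1 - u^j)^2 - \sum_{j>3}(u^1 - u^j)^2 \Big) + \dots + \Big( \sum_{j>1}(u^1 - u^j)^2 - (u^1 - u^d)^2 \Big) - 2d F_1(\theta),
\end{aligned}
\]

which implies

\[ -2d\big(h(\Delta_1 u, \theta, 1) - \langle h(u, \theta, \cdot) \rangle\big) \ge \sum_{j>1}(u^1 - u^j)^2 + (u^1 - u^2)^2 + \big( (u^1 - u^2)^2 + (u^1 - u^3)^2 \big) + \dots + \sum_{j=1}^{d-1}(u^1 - u^j)^2 - 2d F_1(\theta), \]

and the last inequality implies that $h(\Delta_1 u, \theta, 1) - \langle h(u, \theta, \cdot) \rangle < 0$ whenever $\|u\|_\sharp$ is large enough, since by (27) the first sum is at least $(u^1 - u^d)^2 = 4\|u\|_\sharp^2$.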
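For the case $(\Delta_i u)^j \ge 0 \ \forall j$ it suffices, as before, to assume $i = 1$ and

\[ u^1 \le u^2 \le \dots \le u^d. \]

Then

\[ h(\Delta_1 u, \theta, 1) - \langle h(u, \theta, \cdot) \rangle = \frac{1}{2d}\Big( \sum_{j<2}(u^2 - u^j)^2 + \sum_{j<3}(u^3 - u^j)^2 + \dots + \sum_{j<d}(u^d - u^j)^2 \Big) + F_1(\theta). \]

This implies

\[ h(\Delta_1 u, \theta, 1) - \langle h(u, \theta, \cdot) \rangle > 0 \]

whenever $\|u\|_\sharp$ is large enough.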
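The sign conditions (23) and (24) for the quadratic example can also be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the closed form (26), treats the values $f^i(\theta)$ at a fixed $\theta$ as a plain list `f`, and uses the threshold $M = \sqrt{d \max|f|}$, which is what the estimates above yield when $|F_1| \le 2\max|f|$. All function names are hypothetical.

```python
import random

def h_quadratic(u, f, i):
    """Formula (26): h(Delta_i u, theta, i) = f^i(theta) - 1/2 * sum_j ((u^i - u^j)^+)^2.
    `f` holds the values f^i(theta) at one fixed theta (illustrative input)."""
    return f[i] - 0.5 * sum(max(u[i] - uj, 0.0) ** 2 for uj in u)

def centered_h(u, f, i):
    """h(Delta_i u, theta, i) minus the average of h(Delta_k u, theta, k) over k."""
    mean_h = sum(h_quadratic(u, f, k) for k in range(len(u))) / len(u)
    return h_quadratic(u, f, i) - mean_h

def sharp_norm(u):
    """Quotient norm ||u||_sharp = (max u - min u) / 2 (sup norm modulo constants)."""
    return 0.5 * (max(u) - min(u))

def check_contractive(d=4, f_bound=1.0, trials=2000, seed=0):
    """Sample u with ||u||_sharp above the threshold and test the signs in (23)-(24)."""
    rng = random.Random(seed)
    threshold = (d * f_bound) ** 0.5  # assumed M = sqrt(d * max|f|)
    checked = 0
    while checked < trials:
        u = [rng.uniform(-10.0, 10.0) for _ in range(d)]
        if sharp_norm(u) <= threshold + 1e-6:
            continue  # only vectors with large sharp norm are relevant
        f = [rng.uniform(-f_bound, f_bound) for _ in range(d)]
        i_max = max(range(d), key=lambda k: u[k])  # (Delta_i u)^j <= 0 for all j
        i_min = min(range(d), key=lambda k: u[k])  # (Delta_i u)^j >= 0 for all j
        if not (centered_h(u, f, i_max) < 0.0 < centered_h(u, f, i_min)):
            return False
        checked += 1
    return True
```

Calling `check_contractive()` samples random vectors $u$ and bounded values $f$ and reports whether both sign conditions held in every trial.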
2.12 Stationary solutions

We now discuss stationary solutions to (12). It is clear that if, for instance, $h > 0$, equation (12)
cannot admit stationary solutions in the sense that $\frac{d}{dt}u = \frac{d}{dt}\theta = 0$. Therefore we need to consider
stationary solutions to (12) modulo addition of a constant:

Definition 3. A triplet $(\bar\theta, \bar u, \kappa) \in S^d \times \mathbb{R}^d \times \mathbb{R}$ is called a stationary solution of (12) if

\[
\begin{cases}
\sum_j \bar\theta^j \alpha^*_i(\Delta_j \bar u, \bar\theta, j) = 0, \\
h(\Delta_i \bar u, \bar\theta, i) = \kappa.
\end{cases} \tag{29}
\]

If $(\bar\theta, \bar u, \kappa)$ is a stationary solution for the MFG equations, then $(\bar\theta, \bar u - \kappa t)$, with $\kappa t$ subtracted from every component of $\bar u$, solves (12).

Proposition 5. Suppose $h : \mathbb{R}^d \times S^d \times I_d \to \mathbb{R}$ given by (3) is contractive.

(a) For $M$ large enough, the set $\{u \in \mathbb{R}^d, \|u\|_\sharp < M\} \times S^d$ is invariant backwards in time by the
flow of equation (12).

(b) There exists a stationary solution of (12).

Proof. The first item is a direct consequence of Definition 2 and the observations thereafter.
The second item is a consequence of the Brouwer fixed point theorem for flows that leave invariant
compact and convex sets. □

2.13 Uniqueness of stationary solutions and trend to equilibrium 

We now discuss two important consequences of the monotonicity and contractivity properties: the 
uniqueness of solutions and the trend to equilibrium. 

Theorem 3. Suppose that the monotonicity Assumptions 1 and 2 and contractivity hold.

(a) Suppose $\|u(T)\|_\sharp < M$, where $u$ is a solution to (12), and $M$ is large enough. Then $\|u(t)\|_\sharp < M$ $\forall t \in [0, T]$.

(b) The stationary solution $(\bar\theta, \bar u, \kappa)$ is unique (up to the addition of a constant to $\bar u$).

(c) Given $T > 0$, a vector $\theta_0$, and a terminal condition $\psi$, let $(\theta^T, u^T)$ be the solution of (12)
with initial-terminal conditions $\theta^T(-T) = \theta_0$ and $u^{T,i}(T) = \psi^i(\theta^T(T))$. We have, when $T \to \infty$,

\[ \theta^T(0) \to \bar\theta, \qquad \|u^T(0) - \bar u\|_\sharp \to 0, \]

where $(\bar\theta, \bar u)$ is the unique stationary solution for the MFG equations.

Proof. Item (a) is again a direct consequence of Definition 2 and the observations thereafter.

In order to prove items (b) and (c), fix two probability distributions $\theta_0$ and $\tilde\theta_0$ in $S^d$, and
two terminal conditions $\psi$ and $\tilde\psi$. For each $T > 0$, let $(\theta^T, u^T)$ and $(\tilde\theta^T, \tilde u^T)$ be the solutions
of (12) with initial-terminal conditions $\theta^T(-T) = \theta_0$, $u^{T,i}(T) = \psi^i(\theta^T(T))$ and $\tilde\theta^T(-T) = \tilde\theta_0$,
$\tilde u^{T,i}(T) = \tilde\psi^i(\tilde\theta^T(T))$, respectively. By the contractivity hypothesis, $\|u^T(t)\|_\sharp$ and $\|\tilde u^T(t)\|_\sharp$ are
uniformly bounded.

We define

\[ f_T(s) := \|(\theta^T - \tilde\theta^T)(s)\|^2 + \|(u^T - \tilde u^T)(s)\|_\sharp^2 \]

and, for $0 < \tau < T$,

\[ F_T(\tau) := \int_{-\tau}^{\tau} f_T(s)\, ds. \]

By (21), we have

\[ F_T(\tau) \le \frac{1}{\gamma}\big(f_T(\tau) + f_T(-\tau)\big). \]

Note that $\dot F_T(\tau) = f_T(\tau) + f_T(-\tau)$, hence

\[ F_T(\tau) \le \frac{1}{\gamma}\dot F_T(\tau). \]

This implies $\frac{d}{d\tau}\ln F_T(\tau) \ge \gamma$, therefore

\[ \ln F_T(\tau) - \ln F_T(1) \ge (\tau - 1)\gamma, \]

for all $1 < \tau < T$. From this we get

\[ \int_{-1}^{1} f_T(s)\, ds = F_T(1) \le \frac{F_T(T)}{e^{\gamma(T-1)}} \to 0 \quad \text{when } T \to \infty, \]

because $F$ has sub-exponential growth, by (22) in Lemma 2.
Now there exists $t(T) \in [-1, 1]$ with $f_T(t(T)) \le \frac{F_T(1)}{2}$. Hence

\[ \|\theta^T(t(T)) - \tilde\theta^T(t(T))\| \to 0, \qquad \|u^T(t(T)) - \tilde u^T(t(T))\|_\sharp \to 0, \]

as $T \to +\infty$.

Recall that $(\theta^T, u^T)$ and $(\tilde\theta^T, \tilde u^T)$ are solutions of the same time-homogeneous ODE (12), with
data at time $t_T = t(T)$ given by $(\theta^T(t_T), u^T(t_T))$ and $(\tilde\theta^T(t_T), \tilde u^T(t_T))$, whose difference goes to zero as $T \to +\infty$.
From the continuous dependence of solutions of ODEs with respect to initial conditions, and
observing that $t_T \in [-1, 1]$, we can conclude that

\[ \|\theta^T(t) - \tilde\theta^T(t)\| \to 0 \quad \text{and} \quad \|u^T(t) - \tilde u^T(t)\|_\sharp \to 0, \quad \text{uniformly in } t \in [-1, 1], \text{ as } T \to \infty. \]

Now, from Proposition 5 we know there exists a stationary solution $(\bar\theta, \bar u)$. If we choose initial
and terminal conditions $(\tilde\theta_0, \tilde\psi)$ in such a way that $(\tilde\theta^T(t), \tilde u^T(t)) = (\bar\theta, \kappa(T - t) + \bar u)$, we have the
convergence of $(\theta^T, u^T)$ to $(\bar\theta, \bar u)$, which implies both the trend to equilibrium and the uniqueness
of stationary solutions. □

3 The N + 1-player game

In this section we consider games between $N + 1$ players which are symmetric under permutation
of players. As in the previous section we assume that each of the players can be in one of $d$ states,
and knows, in addition to his or her state, the number of players in each of the states.

Players follow a Markovian dynamics in which each player controls the switching rate, as discussed
in §3.1 and §3.2. Using Hamilton-Jacobi ODE methods, §3.3, and a verification Theorem,
§3.4, we formulate the Nash equilibrium problem for the $N + 1$-player problem. Maximum principle
type estimates are considered in §3.3, which are then applied to establishing the existence of
Nash equilibrium solutions in §3.5.

3.1 Controlled Markov Dynamics 

Remember that $I_d = \{1, 2, 3, \dots, d\}$, and let $S^d_N = \{(n^1, \dots, n^d) \in \mathbb{Z}^d \mid \sum_{i=1}^d n^i = N,\ n^i \ge 0\}$. Let
$e_k$ be the $k$-th vector of the canonical basis of $\mathbb{R}^d$, and let $e_{jk} = e_j - e_k$.

In the preceding section we considered a game where a very large number of players was
allowed to switch between $d$ states. The fraction of players in each state was approximated by a
deterministic vector $\theta(t)$. In an analogous way, we consider now a game between $N + 1$ players
that are allowed to switch between the same $d$ states. As before, to describe the game we will use
a reference player. However, we no longer make the assumption that the fraction of the players in
each state can be approximated by a deterministic vector $\theta(t)$. Instead, in addition to the position
$i_t$ of the reference player, we consider a second controlled Markov chain $n_t$, taking values in $S^d_N$,
which records the number of the remaining players (distinct from the reference player) that are
in any of the $d$ states at any given time. Each player knows his or her own state, as well as the number
of remaining players that are in any of the states. No further information is available to any
individual player.

We suppose the reference player switches from state $i$ to state $j$ according to a Markovian
switching rate $\alpha_{ij}(n, t)$ which he or she would like to optimize. We suppose that each of
the players distinct from the reference player follows a controlled Markov process $k_t$ with transition
rates from state $k$ to state $j$ given by $\beta = \beta_{kj}(n, t)$. More precisely, we have, for $j \ne k$,

\[ \mathbb{P}\big(k_{t+h} = j \mid n_t = n, k_t = k\big) = \beta_{kj}(n, t)\, h + o(h), \]

where $\lim \frac{o(h)}{h} = 0$ when $h \to 0$. We suppose that $\beta : I_d \times I_d \times S^d_N \times [0, +\infty) \to \mathbb{R}$ is an
admissible control, that it is bounded and continuous as a function of time, and that $\beta_{kk}(n, t) =
-\sum_{j \ne k} \beta_{kj}(n, t)$ $\forall k, n, t$ and $\beta_{kj}(n, t) \ge 0$ $\forall k \ne j$. We assume further that the state transitions
of the different players are independent, conditioned on $i$ and $n$.

From the symmetry and independence of transitions assumption, for $k \ne j$, we have

\[ \mathbb{P}\big(n_{t+h} = n + e_{jk} \mid n_t = n, i_t = i\big) = \gamma^{n,i}_{\beta,kj}(t)\, h + o(h), \]

where $\lim \frac{o(h)}{h} = 0$ when $h \to 0$ and the transition rates of the process $n$ are given by

\[ \gamma^{n,i}_{\beta,kj}(t) = n^k \beta_{kj}(n + e_{ik}, t). \tag{30} \]

The previous expression for the rate, namely the term $n + e_{ik}$ instead of $n$, follows from the fact
that from the point of view of a player which is in state $k$, and is distinct from the reference player,
the number of other players in any state is given by $n + e_i - e_k = n + e_{ik}$. Note that the rate
function $\beta$ is a deterministic time-dependent function, which makes $(n, i)$ a non-time homogeneous
Markov process.
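The rate formula (30) can be made concrete with a small sketch. It is only an illustration under stated assumptions: states are indexed from 0, the configuration $n$ is a tuple of counts, and `beta(k, j, m, t)` is a hypothetical callable standing in for $\beta_{kj}(m, t)$; none of these names come from the paper.

```python
def shifted(n, i, k):
    """Return n + e_i - e_k, the configuration of the other players as seen by a
    non-reference player in state k when the reference player is in state i."""
    m = list(n)
    m[i] += 1
    m[k] -= 1
    return tuple(m)

def population_rates(n, i, beta, t):
    """Rates of the jumps n -> n + e_j - e_k, following (30):
    gamma_{kj} = n^k * beta_{kj}(n + e_{ik}, t) for k != j."""
    d = len(n)
    rates = {}
    for k in range(d):
        if n[k] == 0:
            continue  # no player in state k, so no jump out of k
        seen = shifted(n, i, k)
        for j in range(d):
            if j != k:
                rates[(k, j)] = n[k] * beta(k, j, seen, t)
    return rates
```

For instance, with `n = (2, 3, 5)`, reference state `i = 0`, and the illustrative control `beta = lambda k, j, m, t: 1.0 + m[j]`, the entry for `(1, 2)` is evaluated at the shifted configuration `(3, 2, 5)` rather than at `n` itself.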

3.2 Individual player point of view 

The reference player would like to choose its transition rate $\alpha$, possibly different from $\beta$, in order
to minimize

\[ u^i_n(t, \beta, \alpha) = \mathbb{E}^{\beta,\alpha}_{A_t(i,n)}\Big[ \int_t^T c\Big(i_s, \frac{n_s}{N}, \alpha(s)\Big)\, ds + \psi^{i_T}\Big(\frac{n_T}{N}\Big) \Big], \tag{31} \]

where the subscript $A_t(i, n)$ means we are considering the expectation conditioned on $i_t = i$, $n_t = n$.
That is, the reference player looks for the control $\alpha$ which is a solution to the minimization problem

\[ u^i_n(t; \beta) = \inf_\alpha u^i_n(t, \beta, \alpha), \]

where the minimization is performed over the set of all admissible controls $\alpha$. We will call the
function $u^i_n(t; \beta)$ above the value function for the reference player associated to the strategy $\beta$ of
the remaining $N$ players. The control $\alpha$ that attains the minimum above can be called the best
response (for the reference player) to a control $\beta$.

3.3 The Hamilton-Jacobi ODE for the N+1-player game 

Fix an admissible control $\beta$. Consider the system of ODEs indexed by $i$ and $n$ given by

\[ -\frac{d\varphi^i_n}{dt}(t) = \sum_{k,j} \gamma^{n,i}_{\beta,kj}(t)\big(\varphi^i_{n+e_{jk}}(t) - \varphi^i_n(t)\big) + h\Big(\Delta_i \varphi_n(t), \frac{n}{N}, i\Big), \tag{32} \]

where $\gamma_\beta$ is given by (30), and, as before, $\Delta_i \varphi_n(t) = (\varphi^1_n(t) - \varphi^i_n(t), \dots, \varphi^d_n(t) - \varphi^i_n(t))$.

This system of ODEs is called the Hamilton-Jacobi (HJ) ODE for the $N + 1$-player game
associated to the strategy $\beta$ of the remaining $N$ players.
We denote by

\[ \|u(t)\|_\infty = \max_{i,n} |u^i_n(t)|. \tag{33} \]

The proof of the next Proposition is analogous to the proof of Proposition 5 and is postponed
to the Appendix.

Proposition 6. Let $u$ be a solution to (32) and $M = \max_{(i,\theta) \in I_d \times S^d} |h(0, \theta, i)|$. Then for all $0 \le t \le T$
we have

\[ \|u(t)\|_\infty \le \|u(T)\|_\infty + 2M(T - t). \]

As a consequence of $h$ being locally Lipschitz continuous, the Picard Theorem, together with the
previous bound, allows us to establish

Theorem 4. The terminal value problem (TVP) given by equation (32) and the terminal condition
$\varphi^i_n(T) = \psi^i\big(\frac{n}{N}\big)$ has a unique solution.

3.4 A verification Theorem for the N+1-player game 

Now we state a verification Theorem, which is completely analogous to the respective verification
Theorem of the preceding section. The corresponding proof can be found in the Appendix.

Theorem 5. Let $v$ be a solution to (32) satisfying the terminal condition $v^i_n(T) = \psi^i\big(\frac{n}{N}\big)$. Then

\[ u^i_n(t; \beta) = v^i_n(t). \]

Also, the Markovian control

\[ \bar\alpha(\beta)(i, n, t) = \alpha^*\Big(\Delta_i v_n(t), \frac{n}{N}, i\Big) \tag{34} \]

is admissible and satisfies

\[ u^i_n(t; \beta) = u^i_n(t, \beta, \bar\alpha(\beta)). \]

Thus a classical solution to the HJ equation associated to $\beta$ is the value function corresponding
to $\beta$ and determines an optimal admissible control $\bar\alpha(\beta)$ for the reference player.

3.5 Equilibrium solutions 

We now consider Nash equilibria for the $N + 1$-player game. For that we look for controls $\beta$ for
which the best response of any player to $\beta$ is $\beta$ itself.

Definition 4. An admissible control $\beta$ is a Nash equilibrium if $\bar\alpha(\beta) = \beta$.

Theorem 6. There exists a unique Nash equilibrium $\bar\beta$.

Proof. A necessary condition for a control $\bar\beta$ to be a Nash equilibrium is that, from (34), we have

\[ \bar\beta_{kj}(n, t) = \alpha^*_j\Big(\Delta_k u_n(t; \bar\beta), \frac{n}{N}, k\Big). \]

Hence this gives rise to the system of nonlinear differential equations

\[ -\frac{du^i_n}{dt} = \sum_{k,j} \gamma^{n,i}_{kj}\big(u^i_{n+e_{jk}} - u^i_n\big) + h\Big(\Delta_i u_n, \frac{n}{N}, i\Big), \tag{35} \]

with terminal condition

\[ u^i_n(T) = \psi^i\Big(\frac{n}{N}\Big) \qquad \forall i \in I_d,\ n \in S^d_N, \tag{36} \]

where $\gamma^{n,i}_{kj}$ are given by

\[ \gamma^{n,i}_{kj} = n^k \alpha^*_j\Big(\Delta_k u_{n+e_{ik}}, \frac{n + e_{ik}}{N}, k\Big). \tag{37} \]

Note that (35) is well posed because $u_n$ is bounded and the right-hand side is Lipschitz, so it admits
a unique solution. Hence existence and uniqueness of a Nash equilibrium follows. □

The following property of $\gamma^{n,i}_{kj}$ will be proved in the Appendix:

Lemma 3. Let us suppose that $\|\Delta_k u_n\|_\infty$ is bounded, and denote $z_{n,sr} = u_{n+e_{rs}} - u_n$. Then
we have

\[ \big|\gamma^{n+e_{rs},i}_{kj} - \gamma^{n,i}_{kj}\big| \le C + CN \max_{rs} \|z_{\cdot,sr}\|_\infty. \tag{38} \]



4 Convergence 

This last section addresses the convergence as the number of players tends to infinity to the mean 
field model derived in the previous section. 

We start this section by discussing some preliminary estimates in §4.1. Then, in §4.2 we
establish uniform estimates for $|u_{n+e_{rs}} - u_n|$, which are essential to prove our main result, Theorem
7, which is discussed in §4.3. This theorem shows that the model derived in the previous section
can be obtained as an appropriate limit of the model with $N + 1$ players discussed in Section 3.

4.1 Preliminary results 

Let us denote $m = (i, n) \in I_d \times S^d_N$, and consider the system of ordinary differential equations

\[ -\dot z^i_n = \sum_{k,j} a^i_{n,kj}\big(z^i_{n+e_{jk}} - z^i_n\big) + \sum_j a^i_{n,j}\big(z^j_n - z^i_n\big), \]

where $a^i_{n,kj} \ge 0$ and $a^i_{n,j} \ge 0$. Note that this system is a particular case of

\[ -\dot z_m = \sum_{m' \in I_d \times S^d_N} a_{mm'}(t)\big(z_{m'} - z_m\big), \tag{39} \]

where $a_{mm'}(t) \ge 0$. We write (39) in compact form as

\[ -\dot z(t) = M(t) z(t). \tag{40} \]

The solution to this equation with terminal data $z(T)$ can be written as

\[ z(t) = K(t, T) z(T), \tag{41} \]

where $K(t, T)$ is the fundamental solution to (40) with $K(T, T) = I$. Note that equations (40)
and (41) imply

\[ \frac{d}{dt} K(t, T) = -M(t) K(t, T). \tag{42} \]

The proofs of Lemma 4, Lemma 6 and Lemma 7 can be found in the Appendix.
Lemma 4. For $t \le T$ we have

\[ \|z(t)\|_\infty \le \|z(T)\|_\infty \]

(see (33)). Furthermore, if $z(T) \le 0$ then $z(t) \le 0$.

From the previous Lemma we also conclude

Lemma 5. If $p_1 \le p_2$, and $t \le s$, then we have

\[ K(t, s) p_1 \le K(t, s) p_2, \]

which means $K(t, s)$ is an order preserving operator.

Proof. Observe that if $p_1 - p_2 \le 0$ then $K(t, s)(p_1 - p_2) \le 0$, by Lemma 4. □
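The maximum principle of Lemma 4 and the order preservation of Lemma 5 can be observed on a small discretized instance of (39). This is an illustrative sketch only: it assumes constant nonnegative coefficients stored in a hypothetical matrix `a`, and uses an explicit Euler scheme backward in time, for which each step is a convex combination of the previous values as long as the step size times the largest row sum is at most 1.

```python
def backward_step(z, a, dt):
    """One explicit Euler step backward in time for -dz_m/dt = sum_{m'} a[m][m'] (z_{m'} - z_m)."""
    size = len(z)
    return [
        z[m] + dt * sum(a[m][mp] * (z[mp] - z[m]) for mp in range(size) if mp != m)
        for m in range(size)
    ]

def solve_backward(z_T, a, T, steps):
    """Integrate from the terminal data z(T) down to time 0; history[k] approximates z(T - k*dt)."""
    dt = T / steps
    max_row_sum = max(sum(row[mp] for mp in range(len(row)) if mp != m)
                      for m, row in enumerate(a))
    if dt * max_row_sum > 1.0:
        raise ValueError("step too large: the scheme would lose the maximum principle")
    z = list(z_T)
    history = [z]
    for _ in range(steps):
        z = backward_step(z, a, dt)
        history.append(z)
    return history

def sup_norm(z):
    """Maximum absolute value of the entries, as in (33)."""
    return max(abs(x) for x in z)
```

Along the returned history the sup norm should not increase, and two runs with ordered terminal data should stay ordered, which is the discrete counterpart of the two lemmas.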

Lemma 6. Suppose $z$ is a solution to

\[ -\dot z(s) \le M(s) z(s) + f(z(s)), \tag{43} \]

where $M(t)$ was defined in (39) and (40). Then, for all $m = (i, n) \in I_d \times S^d_N$,

\[ z_m(t) = z^i_n(t) \le \|z(T)\|_\infty + \int_t^T \|f(z(s))\|_\infty\, ds. \]

Lemma 7. Suppose $v : [0, T] \to \mathbb{R}$ is a solution to the ODE with terminal condition

\[
\begin{cases}
-\dfrac{dv}{dt} = Cv + CNv^2 + \dfrac{C}{N}, \\[2mm]
v(T) \le \dfrac{C}{N},
\end{cases} \tag{44}
\]

where $N$ is a natural number, and $C > 0$. Then, there exists $T^* > 0$, which does not depend on
$N$, such that $T < T^*$ implies $v(s) \le \frac{2C}{N}$ for all $0 \le s \le T$.
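The reason $T^*$ can be taken independent of $N$ is that the rescaled function $w = Nv$ solves $-\dot w = Cw + Cw^2 + C$ with $w(T) \le C$, an equation in which $N$ no longer appears. The sketch below illustrates this numerically; it is not the proof from the Appendix. It assumes equality in the terminal condition, integrates with a classical Runge-Kutta scheme in the reversed time $T - t$, and the function name is hypothetical.

```python
def solve_v(C, N, T, steps=2000):
    """Integrate -dv/dt = C v + C N v^2 + C/N backward from v(T) = C/N and return v(0).
    Since the right-hand side is positive, v(0) is the maximum of v on [0, T]."""
    def g(v):
        return C * v + C * N * v * v + C / N

    h = T / steps
    v = C / N
    for _ in range(steps):
        # classical fourth-order Runge-Kutta step in the reversed time variable
        k1 = g(v)
        k2 = g(v + 0.5 * h * k1)
        k3 = g(v + 0.5 * h * k2)
        k4 = g(v + h * k3)
        v += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return v
```

With `C = 1.0`, the product `N * solve_v(1.0, N, T)` comes out the same for different `N`, and the bound `2 * C / N` holds for `T = 0.2` but fails for `T = 0.3`, consistent with a threshold $T^*$ that depends only on $C$.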



4.2 Gradient estimates 

In this section we prove "gradient estimates" for the $N + 1$-player game, that is, we assume that
the difference $u_{n+e_{rs}} - u_n$ is of the order $\frac{1}{N}$ at time $T$ and show that it remains so for $0 \le t \le T$,
as long as $T$ is sufficiently small.

Proposition 7. Let $u^i_n(t)$ be a solution of (35) with terminal conditions (36). Then there exist
$C > 0$ and $T^* > 0$ such that, for $0 < T < T^*$, we have

\[ \max_{rs} \|u_{n+e_{rs}}(t) - u_n(t)\|_\infty \le \frac{2C}{N} \]

for all $0 \le t \le T$.

Before proving the Proposition, we recall that the norm $\|\cdot\|_\infty$ was defined in (33).

Proof. Using the terminal condition (36) and remembering that $\psi$ is Lipschitz continuous, we
know that there is a constant $C > 0$ such that

\[ \max_{rs} \|u_{n+e_{rs}}(T) - u_n(T)\|_\infty \le \frac{C}{N}. \tag{45} \]
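Let $z^i_{n,sr} = u^i_{n+e_{rs}} - u^i_n$. We have

\[
\begin{aligned}
-\dot z^i_{n,sr}
&= \sum_{k,j} \Big[ \gamma^{n+e_{rs},i}_{kj}\big(u^i_{n+e_{rs}+e_{jk}} - u^i_{n+e_{rs}}\big) - \gamma^{n,i}_{kj}\big(u^i_{n+e_{jk}} - u^i_n\big) \Big]
+ h\Big(\Delta_i u_{n+e_{rs}}, \frac{n+e_{rs}}{N}, i\Big) - h\Big(\Delta_i u_n, \frac{n}{N}, i\Big) \\
&= \sum_{k,j} \Big[ \gamma^{n+e_{rs},i}_{kj}\, z^i_{n+e_{rs},kj} - \gamma^{n,i}_{kj}\, z^i_{n,kj} \Big]
+ h\Big(\Delta_i u_{n+e_{rs}}, \frac{n+e_{rs}}{N}, i\Big) - h\Big(\Delta_i u_n, \frac{n}{N}, i\Big) \\
&= \sum_{k,j} \frac{\gamma^{n+e_{rs},i}_{kj} + \gamma^{n,i}_{kj}}{2}\big(z^i_{n+e_{rs},kj} - z^i_{n,kj}\big)
+ \sum_{k,j} \frac{\gamma^{n+e_{rs},i}_{kj} - \gamma^{n,i}_{kj}}{2}\big(z^i_{n+e_{rs},kj} + z^i_{n,kj}\big) \\
&\quad + h\Big(\Delta_i u_{n+e_{rs}}, \frac{n+e_{rs}}{N}, i\Big) - h\Big(\Delta_i u_n, \frac{n}{N}, i\Big).
\end{aligned}
\]

Note that $z_{n+e_{rs},kj} - z_{n,kj} = u_{n+e_{rs}+e_{jk}} - u_{n+e_{rs}} - u_{n+e_{jk}} + u_n = z_{n+e_{jk},sr} - z_{n,sr}$.

From Lemma 3 we have

\[ \big|\gamma^{n+e_{rs},i}_{kj} - \gamma^{n,i}_{kj}\big| \le C + CN \max_{rs} \|z_{\cdot,sr}\|_\infty. \]

And note that

\[ \Big| \sum_{k,j} \big(z^i_{n+e_{rs},kj} + z^i_{n,kj}\big) \Big| \le 2\sum_{k,j} \|z_{\cdot,kj}\|_\infty \le 2d^2 \max_{k,j} \|z_{\cdot,kj}\|_\infty. \]

Hence

\[ \sum_{k,j} \Big| \frac{\gamma^{n+e_{rs},i}_{kj} - \gamma^{n,i}_{kj}}{2}\big(z^i_{n+e_{rs},kj} + z^i_{n,kj}\big) \Big| \le C \max_{k,j} \|z_{\cdot,kj}\|_\infty + CN \max_{k,j} \|z_{\cdot,kj}\|_\infty^2. \]

Using item (a) of Proposition 1 and also that $z^i_{n,sr} \sum_j \alpha^*_j\big(\Delta_i u_n, \frac{n}{N}, i\big) = 0$, we have

\[
\begin{aligned}
h\Big(\Delta_i u_{n+e_{rs}}, \frac{n+e_{rs}}{N}, i\Big) - h\Big(\Delta_i u_n, \frac{n}{N}, i\Big)
&= h\Big(\Delta_i u_{n+e_{rs}}, \frac{n+e_{rs}}{N}, i\Big) - h\Big(\Delta_i u_{n+e_{rs}}, \frac{n}{N}, i\Big) \\
&\quad + h\Big(\Delta_i u_{n+e_{rs}}, \frac{n}{N}, i\Big) - h\Big(\Delta_i u_n, \frac{n}{N}, i\Big) \\
&\le \frac{C}{N} + \sum_j \alpha^*_j\Big(\Delta_i u_n, \frac{n}{N}, i\Big)\big(z^j_{n,sr} - z^i_{n,sr}\big).
\end{aligned}
\]

Now denoting $a^i_{n,kj,sr} = \frac{1}{2}\big(\gamma^{n+e_{rs},i}_{kj} + \gamma^{n,i}_{kj}\big)$ and $a^i_{n,j} = \alpha^*_j\big(\Delta_i u_n, \frac{n}{N}, i\big)$, we get

\[ -\dot z^i_{n,sr} \le \sum_{k,j} a^i_{n,kj,sr}\big(z^i_{n+e_{jk},sr} - z^i_{n,sr}\big) + \sum_j a^i_{n,j}\big(z^j_{n,sr} - z^i_{n,sr}\big) + f(z), \]

where

\[ f(z) = \frac{C}{N} + C \max_{rs} \|z_{\cdot,sr}\|_\infty + CN \max_{rs} \|z_{\cdot,sr}\|_\infty^2. \]

At this point we are in position to apply Lemma 6 from the previous section. We obtain

\[ z^i_{n,sr}(t) \le \|z_{\cdot,sr}(T)\|_\infty + \int_t^T \Big[ C \max_{rs} \|z_{\cdot,sr}(s)\|_\infty + CN \max_{rs} \|z_{\cdot,sr}(s)\|_\infty^2 + \frac{C}{N} \Big]\, ds. \]

Finally, as $z^i_{n,sr} = u^i_{n+e_{rs}} - u^i_n$, if we set $w = \max_{rs} \|u_{n+e_{rs}} - u_n\|_\infty$ we conclude that

\[ w(t) \le w(T) + \int_t^T \Big[ Cw(s) + CNw(s)^2 + \frac{C}{N} \Big]\, ds. \]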
Now we define

\[ \eta(t) = w(T) + \int_t^T \Big[ Cw(s) + CNw(s)^2 + \frac{C}{N} \Big]\, ds. \]

We have that

\[ w(t) \le \eta(t), \tag{46} \]

and also that

\[ \frac{d\eta}{dt}(t) = -g(w(t)), \]

where $g$ is the nondecreasing function $g(w) = Cw + CNw^2 + \frac{C}{N}$. Thus

\[
\begin{cases}
\dfrac{d\eta}{dt}(t) \ge -g(\eta(t)), \\[2mm]
\eta(T) = w(T).
\end{cases}
\]

A standard argument from the basic theory of differential inequalities can now be used to prove
that $\eta(t) \le v(t)$ for $0 \le t \le T$, if $v(t)$ is the solution of

\[
\begin{cases}
\dfrac{dv}{dt}(t) = -g(v(t)), \\[2mm]
v(T) = w(T).
\end{cases}
\]

This last result can be combined with Lemma 7, the inequality (45), which means $w(T) \le \frac{C}{N}$,
and the inequality (46), to prove that $w(t) \le \frac{2C}{N}$ for all $0 \le t \le T$, which ends the proof of the
Proposition. □

4.3 Convergence 

In this section we prove Theorem 7, which implies the convergence of both distribution and value
function of the $N + 1$-player game to the mean field game, for small times.

Let $\theta_0 = (\theta_0^1, \theta_0^2, \dots, \theta_0^d) \in S^d$ be given. We start by assuming that at the initial time the $N$
players distinct from the reference player are randomly assigned states $1, 2, \dots, d$ independently
according to the initial distribution $\theta_0$ (i.e. choosing state $k$ with probability $\theta_0^k$). Therefore, $n_0$
is a random vector of $\mathbb{Z}^d$ that follows a multinomial distribution with parameters $N$ and $\theta_0$.

We will write $n^l_t$ for the $l$-th coordinate of $n_t$, which means the number of players (distinct
from the reference player) that are in state $l$ at time $t$.

The norm we use for vectors of $\mathbb{R}^d$, in this section, is the norm $\|v\| = \max\{|v^1|, |v^2|, \dots, |v^d|\}$,
where $|v^l|$ is the absolute value of the $l$-th coordinate of $v$.

The main result is the following: 

Theorem 7. Let $T^*$ be as in Proposition 7. There exists a constant $C$, independent of $N$, for
which, if $T < T^*$ satisfies $\rho = TC < 1$, then

\[ V_N(t) + W_N(t) \le \frac{C}{1 - \rho}\,\frac{1}{N} \]

for all $t \in [0, T]$, where

\[ V_N(t) = \mathbb{E}\Big[ \Big\| \frac{n_t}{N} - \theta(t) \Big\|^2 \Big] \]

and

\[ W_N(t) = \mathbb{E}\Big[ \big\| u(t) - u^N_{n_t}(t) \big\|^2 \Big], \]

where the pair $\theta(t)$ and $u = u(t)$ is the solution of the MFG (12), and $u^N = u^N(t)$ is the
value function of the $N + 1$-player game, i.e., the solution of (35).
Before proving Theorem 7 we need two Lemmas. Let

\[
\begin{cases}
V_N(l, t) = \mathbb{E}\Big[ \Big( \dfrac{n^l_t}{N} - \theta^l(t) \Big)^2 \Big], \\[3mm]
W_N(l, t) = \mathbb{E}\Big[ \big( u^l(t) - u^{N,l}_{n_t}(t) \big)^2 \Big].
\end{cases} \tag{47}
\]

We have

\[ V_N(t) = \max\{V_N(1, t), \dots, V_N(d, t)\} \quad \text{and} \quad W_N(t) = \max\{W_N(1, t), \dots, W_N(d, t)\}, \tag{48} \]

and

\[ V_N(l, 0) = \mathrm{Var}\Big[ \frac{n^l_0}{N} \Big] = \frac{\theta_0^l (1 - \theta_0^l)}{N}, \tag{49} \]

because $n^l_0$ is the sum of $N$ independent and identically distributed random variables, each of
them having Bernoulli distribution with parameter $\theta_0^l$.
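The identity (49) can be verified exactly for small $N$ by summing over the binomial distribution of a single coordinate $n^l_0$. This is a small sanity check rather than part of the proof; the function name and the sample values are illustrative.

```python
from math import comb

def variance_of_fraction(N, theta_l):
    """Exact Var[n_0^l / N] when n_0^l is Binomial(N, theta_l), i.e. one coordinate
    of the multinomial vector n_0 with parameters N and theta_0."""
    pmf = [comb(N, k) * theta_l**k * (1.0 - theta_l) ** (N - k) for k in range(N + 1)]
    mean = sum((k / N) * p for k, p in enumerate(pmf))
    return sum((k / N - mean) ** 2 * p for k, p in enumerate(pmf))
```

For example, `variance_of_fraction(50, 0.3)` agrees with `0.3 * 0.7 / 50` up to rounding, and the value never exceeds `1 / (4 * N)`, which is the source of the constant $1/4$ used later in the proof of Lemma 8.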

An important tool for proving Lemmas 8 and 9 is again the Dynkin Formula, now adapted to
the present situation: define the infinitesimal generator of the process $(i, n)$ acting on a function
$\varphi : I_d \times S^d_N \times [0, +\infty) \to \mathbb{R}$, $C^1$ in the last variable, by

\[ A^\alpha \varphi(i, n, s) = \sum_j \alpha^N_{ij}\big[\varphi(j, n, s) - \varphi(i, n, s)\big] + \sum_{k,j} n^k \alpha^{N,i}_{kj}\big[\varphi(i, n + e_{jk}, s) - \varphi(i, n, s)\big], \tag{50} \]

where

\[ \alpha^{N,i}_{kj} = \alpha^*_j\Big(\Delta_k u^N_{n+e_{ik}}, \frac{n + e_{ik}}{N}, k\Big) \tag{51} \]

is the transition rate from state $k$ to state $j$ for equilibrium solutions of the $N + 1$-player game,
as in Section 3.

Then for any $t < T$,

\[ \mathbb{E}\big[\varphi(i_T, n_T, T) - \varphi(i_t, n_t, t)\big] = \mathbb{E}\Big[ \int_t^T \frac{\partial\varphi}{\partial s}(i, n, s) + A^\alpha \varphi(i, n, s)\, ds \Big]. \tag{52} \]

Note that in the right hand side of the equation above, the processes $i$ and $n$ are evaluated at
time $s$.

We will also denote by

\[ \alpha_{ij} = \alpha^*_j(\Delta_i u, \theta, i) \]

the transition rate from state $i$ to state $j$ in the equilibrium solutions of the mean field game as
in Section 2.



Lemma 8. Let $T^*$ be as in Proposition 7 and suppose $T < T^*$. There exists $C_1 > 0$ such that

\[ V_N(t) \le \int_0^t C_1\big(V_N(s) + W_N(s)\big)\, ds + \frac{C_1}{N}. \]







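Proof. Using Dynkin's Formula (52) with $\varphi_l(i, n, s) = \big(\frac{n^l}{N} - \theta^l(s)\big)^2$, and (49), we have

\[ V_N(l, t) - \frac{\theta_0^l(1 - \theta_0^l)}{N} = \mathbb{E}\int_0^t \big(\omega_{N,l}(s) + \varsigma_{N,l}(s)\big)\, ds, \]

where

\[ \varsigma_{N,l}(s) = \frac{\partial\varphi_l}{\partial s}(n, s) = -2\Big(\frac{n^l}{N} - \theta^l\Big)\sum_k \theta^k \alpha_{kl} \]

and

\[ \omega_{N,l}(s) = \sum_k \sum_j n^k \alpha^{N,i}_{kj}\big[\varphi_l(n + e_{jk}, s) - \varphi_l(n, s)\big]. \]

Note that $\varphi_l(i, n, s)$ depends only on $n^l$ and $s$, so the first sum in (50) vanishes. Moreover, $\varphi_l(n + e_{jk}, s) = \varphi_l(n, s)$ if both $j \ne l$ and
$k \ne l$. Hence

\[
\begin{aligned}
\omega_{N,l}(s)
&= \sum_{k \ne l} n^k \alpha^{N,i}_{kl}\big[\varphi_l(n + e_{lk}, s) - \varphi_l(n, s)\big] + \sum_{j \ne l} n^l \alpha^{N,i}_{lj}\big[\varphi_l(n + e_{jl}, s) - \varphi_l(n, s)\big] \\
&= \sum_{k \ne l} n^k \alpha^{N,i}_{kl}\Big[\Big(\frac{n^l + 1}{N} - \theta^l\Big)^2 - \Big(\frac{n^l}{N} - \theta^l\Big)^2\Big] + \sum_{j \ne l} n^l \alpha^{N,i}_{lj}\Big[\Big(\frac{n^l - 1}{N} - \theta^l\Big)^2 - \Big(\frac{n^l}{N} - \theta^l\Big)^2\Big] \\
&= \sum_{k \ne l} \frac{n^k}{N} \alpha^{N,i}_{kl}\Big[2\Big(\frac{n^l}{N} - \theta^l\Big) + \frac{1}{N}\Big] + \frac{n^l}{N}\sum_{j \ne l} \alpha^{N,i}_{lj}\Big[-2\Big(\frac{n^l}{N} - \theta^l\Big) + \frac{1}{N}\Big] \\
&\le 2\Big(\frac{n^l}{N} - \theta^l\Big)\sum_{k \in I_d} \frac{n^k}{N}\alpha^{N,i}_{kl} + \frac{C}{N},
\end{aligned}
\]

where we used $\alpha^{N,i}_{ll} = -\sum_{j \ne l}\alpha^{N,i}_{lj}$ and, in the inequality above, the fact that $\alpha^{N,i}_{kl}$ is bounded (with bounds that do not
depend on $N$; here we are using that $\alpha^*$ is Lipschitz and $\Delta_i u_n$ is bounded, see Propositions 1
and 7). Now

\[
\begin{aligned}
\varsigma_{N,l}(s) + \omega_{N,l}(s)
&\le 2\Big(\frac{n^l}{N} - \theta^l\Big)\sum_k \Big[\frac{n^k}{N}\alpha^{N,i}_{kl} - \theta^k \alpha_{kl}\Big] + \frac{C}{N} \\
&= 2\Big(\frac{n^l}{N} - \theta^l\Big)\sum_k \Big[\frac{n^k}{N}\alpha^{N,i}_{kl} - \frac{n^k}{N}\alpha_{kl} + \frac{n^k}{N}\alpha_{kl} - \theta^k \alpha_{kl}\Big] + \frac{C}{N} \\
&= 2\Big(\frac{n^l}{N} - \theta^l\Big)\sum_k \Big[\frac{n^k}{N}\big(\alpha^{N,i}_{kl} - \alpha_{kl}\big) + \alpha_{kl}\Big(\frac{n^k}{N} - \theta^k\Big)\Big] + \frac{C}{N}.
\end{aligned}
\]

Then

\[
\begin{aligned}
V_N(l, t) &= \mathbb{E}\int_0^t \big(\omega_{N,l}(s) + \varsigma_{N,l}(s)\big)\, ds + \frac{\theta_0^l(1 - \theta_0^l)}{N} \\
&\le \mathbb{E}\int_0^t 2\Big(\frac{n^l}{N} - \theta^l\Big)\sum_k \frac{n^k}{N}\big(\alpha^{N,i}_{kl} - \alpha_{kl}\big)\, ds
+ \mathbb{E}\int_0^t 2\Big(\frac{n^l}{N} - \theta^l\Big)\sum_k \alpha_{kl}\Big(\frac{n^k}{N} - \theta^k\Big)\, ds + \frac{CT + 1/4}{N}.
\end{aligned}
\]

Now, using again the fact that $\alpha^*$ is Lipschitz and

\[ \|\Delta_k w - \Delta_k z\| \le 2\|w - z\|, \tag{53} \]

we see that, for a suitable constant $K$,

\[
\begin{aligned}
\big|\alpha^{N,i}_{kl} - \alpha_{kl}\big|
&= \Big|\alpha^*_l\Big(\Delta_k u^N_{n+e_{ik}}, \frac{n + e_{ik}}{N}, k\Big) - \alpha^*_l(\Delta_k u, \theta, k)\Big| \\
&\le K\Big(\Big\|\frac{n + e_{ik}}{N} - \theta\Big\| + \big\|u^N_{n+e_{ik}} - u\big\|\Big) \\
&\le K\Big(\Big\|\frac{n}{N} - \theta\Big\| + \frac{\|e_{ik}\|}{N} + \big\|u^N_{n+e_{ik}} - u^N_n\big\| + \big\|u^N_n - u\big\|\Big) \\
&\le K\Big(\Big\|\frac{n}{N} - \theta\Big\| + \big\|u^N_n - u\big\| + \frac{2 + 2C}{N}\Big),
\end{aligned}
\]

where in the last inequality we used the gradient estimates of Proposition 7.
Therefore, since $\sum_k \frac{n^k}{N} = 1$ and $\big|\frac{n^l}{N} - \theta^l\big| \le 1$,

\[
\begin{aligned}
V_N(l, t)
&\le 2K\,\mathbb{E}\int_0^t \Big|\frac{n^l}{N} - \theta^l\Big|\Big(\Big\|\frac{n}{N} - \theta\Big\| + \big\|u^N_n - u\big\| + \frac{2 + 2C}{N}\Big)\, ds \\
&\quad + C_3\,\mathbb{E}\int_0^t 2\Big|\frac{n^l}{N} - \theta^l\Big|\sum_k \Big|\frac{n^k}{N} - \theta^k\Big|\, ds + \frac{CT + 1/4}{N} \\
&\le 2K\,\mathbb{E}\int_0^t \Big|\frac{n^l}{N} - \theta^l\Big|\Big(\Big\|\frac{n}{N} - \theta\Big\| + \big\|u^N_n - u\big\|\Big)\, ds
+ C_3\,\mathbb{E}\int_0^t 2\Big|\frac{n^l}{N} - \theta^l\Big|\sum_k \Big|\frac{n^k}{N} - \theta^k\Big|\, ds + \frac{C_4}{N},
\end{aligned}
\]

where we used the fact that $\alpha_{kl}$ is bounded by a constant $C_3$, and $C_4 = CT + 1/4 + 2K(2 + 2C)T$.

Now

\[ V_N(l, t) \le 2K\,\mathbb{E}\int_0^t \Big(\Big\|\frac{n}{N} - \theta\Big\|^2 + \Big\|\frac{n}{N} - \theta\Big\|\,\big\|u^N_n - u\big\|\Big)\, ds + 2dC_3\,\mathbb{E}\int_0^t \Big\|\frac{n}{N} - \theta\Big\|^2\, ds + \frac{C_4}{N}. \]

Finally using $ab \le a^2 + b^2$ and (48), we have

\[ V_N(t) \le \int_0^t C_1\big(V_N(s) + W_N(s)\big)\, ds + \frac{C_1}{N}, \]

where $C_1 = 3\max\{2K, 2dC_3, C_4\}$. □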



Lemma 9. Let $T^*$ be as in Proposition 7 and suppose $T < T^*$. There exists $C_2 > 0$ such that

\[ W_N(t) \le \int_t^T C_2\big(V_N(s) + W_N(s)\big)\, ds + \frac{C_2}{N}. \]

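Proof. Using Dynkin's formula (52) with $\varphi_l(i, n, s) = \big(u^{N,l}_n(s) - u^l(s)\big)^2$, and equations (35) and
(12), we have

\[
\begin{aligned}
W_N(l, t) - W_N(l, T)
&= -\mathbb{E}\Big[\big(u^{N,l}_{n_T}(T) - u^l(T)\big)^2 - \big(u^{N,l}_{n_t}(t) - u^l(t)\big)^2\Big] \\
&= \mathbb{E}\int_t^T 2\big(u^{N,l}_n - u^l\big)\frac{d}{ds}\big(u^l - u^{N,l}_n\big)\, ds
+ \mathbb{E}\int_t^T \sum_{k,j} n^k \alpha^{N,i}_{kj}\Big[\big(u^{N,l}_n - u^l\big)^2 - \big(u^{N,l}_{n+e_{jk}} - u^l\big)^2\Big]\, ds \\
&= -\mathbb{E}\int_t^T \sum_{k,j} n^k \alpha^{N,i}_{kj}\big(u^{N,l}_{n+e_{jk}} - u^{N,l}_n\big)^2\, ds
- \mathbb{E}\int_t^T 2\big(u^{N,l}_n - u^l\big)\Big(h(\Delta_l u, \theta, l) - h\Big(\Delta_l u^N_n, \frac{n}{N}, l\Big)\Big)\, ds.
\end{aligned}
\]

In the last equation we used the fact that

\[ -2\big(u^{N,l}_n - u^l\big)\big(u^{N,l}_{n+e_{jk}} - u^{N,l}_n\big) + \big(u^{N,l}_{n+e_{jk}} - u^l\big)^2 - \big(u^{N,l}_n - u^l\big)^2 = \big(u^{N,l}_{n+e_{jk}} - u^{N,l}_n\big)^2. \]

Now, using the gradient estimates from §4.2, Proposition 7, we have that

\[ \big(u^{N,l}_{n+e_{jk}} - u^{N,l}_n\big)^2 \le \frac{4C^2}{N^2}, \]

which, since $\sum_k n^k = N$ and the rates $\alpha^{N,i}_{kj}$ are bounded, implies

\[ \sum_{k,j} n^k \alpha^{N,i}_{kj}\big(u^{N,l}_{n+e_{jk}} - u^{N,l}_n\big)^2 \le \frac{K_3}{N} \]

for a constant $K_3$ independent of $N$. For the same reason we have that $W_N(T)$ is bounded by a constant multiple of $\frac{1}{N}$, which implies

\[ W_N(l, t) \le \frac{K_4}{N} + 2\,\mathbb{E}\int_t^T \Big|h(\Delta_l u, \theta, l) - h\Big(\Delta_l u^N_n, \frac{n}{N}, l\Big)\Big|\,\big|u^{N,l}_n - u^l\big|\, ds. \]

Using the fact that $h$ is Lipschitz in both variables, with Lipschitz constant uniform (since $\Delta u$ is
bounded) and (53) we see that

\[ \Big|h(\Delta_l u, \theta, l) - h\Big(\Delta_l u^N_n, \frac{n}{N}, l\Big)\Big| \le K\Big(\Big\|\frac{n}{N} - \theta\Big\| + \big\|u^N_n - u\big\|\Big). \]

Therefore, using $\big|u^{N,l}_n(s) - u^l(s)\big| \le \|u^N_n - u\|$ and again $ab \le a^2 + b^2$, we have

\[ W_N(t) \le \frac{K_4}{N} + K_5\int_t^T \big(V_N(s) + W_N(s)\big)\, ds, \]

so the Lemma holds with $C_2 = \max\{K_4, K_5\}$, which ends the proof. □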
Now we can prove our main result that establishes the convergence of the $N + 1$-player game
to the mean field model as $N \to \infty$.

Proof of Theorem 7.

Define $C = C_1 + C_2$. Adding both inequalities given in the two last Lemmas, we have

\[ W_N(t) + V_N(t) \le C\int_0^T \big(V_N(s) + W_N(s)\big)\, ds + \frac{C}{N}. \]

Now suppose $\rho = TC < 1$. Defining

\[ \overline{W_N + V_N} = \max_{0 \le t \le T}\big(W_N(t) + V_N(t)\big), \]

we have

\[ \overline{W_N + V_N} \le \rho\,\overline{W_N + V_N} + \frac{C}{N}, \]

that is, $\overline{W_N + V_N} \le \frac{C}{1 - \rho}\frac{1}{N}$, which proves Theorem 7. □

5 Potential mean field games 

An important class of examples are potential mean field games, which have additional structures
that can be used to deduce further properties. In these mean field games $h$ has the form

\[ h(z, \theta, i) = \tilde h(z, i) + f^i(\theta), \tag{54} \]

where $\tilde h : \mathbb{R}^d \times I_d \to \mathbb{R}$ and $f : \mathbb{R}^d \times I_d \to \mathbb{R}$ is the gradient of a convex function. More precisely,
we suppose that there exists a convex function $F : \mathbb{R}^d \to \mathbb{R}$ such that $\nabla_\theta F = f(\cdot, \theta)$.

5.1 Hamiltonian and Lagrangian formulations 

Let $H : \mathbb{R}^{2d} \to \mathbb{R}$ be given by

\[ H(u, \theta) = \sum_i \theta^i \tilde h(\Delta_i u, i) + F(\theta) = \theta \cdot \tilde h(\Delta_\cdot u, \cdot) + F(\theta). \tag{55} \]

A direct computation shows that (12) can be written as

\[
\begin{cases}
\dfrac{\partial H}{\partial u^i} = \dot\theta^i, \\[2mm]
\dfrac{\partial H}{\partial \theta^i} = -\dot u^i.
\end{cases} \tag{56}
\]

This means the flow generated by equation (12) is Hamiltonian. In addition to the fact that the
Hamiltonian is preserved by the flow (56), the special structure of $H$, which depends only on
$\Delta_i u$, implies that $\sum_i \theta^i$ is also a conserved quantity, which is consistent with the interpretation
of $\theta$ in terms of probability distribution of players.

Given a convex function $G(p)$ we define the Legendre transform as

\[ G^*(q) = \sup_p \big( -q \cdot p - G(p) \big). \]

If $G$ is strictly convex and the previous supremum is achieved, then $q = -\nabla G(p)$, or equivalently
$p = -\nabla G^*(q)$.
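Note the sign convention: the transform above uses $-q \cdot p$ rather than the more common $q \cdot p$. A one-dimensional sketch makes the convention concrete. It is only an illustration: it assumes the quadratic $G(p) = p^2/2$, for which the supremum is attained at $p = -q$ and $G^*(q) = q^2/2$, and approximates the supremum by a grid search; the function name and grid bounds are arbitrary choices.

```python
def legendre(G, q, p_min=-10.0, p_max=10.0, steps=20000):
    """Approximate G*(q) = sup_p (-q*p - G(p)) on a uniform grid.
    Returns the approximate supremum and the grid point where it is attained."""
    best_val, best_p = float("-inf"), None
    for k in range(steps + 1):
        p = p_min + (p_max - p_min) * k / steps
        val = -q * p - G(p)
        if val > best_val:
            best_val, best_p = val, p
    return best_val, best_p
```

With `G = lambda p: 0.5 * p * p` and `q = 1.5`, the search returns a value close to `1.125` attained near `p = -1.5`, in agreement with $q = -\nabla G(p)$ and $p = -\nabla G^*(q)$.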
If the function $F$ is strictly convex in $\theta$ then the Hamiltonian $H$ is strictly convex in $\theta$. This
allows us to consider the Legendre transform

\[
\begin{aligned}
L(u, \dot u) &= \sup_\theta \big( -\dot u \cdot \theta - H(u, \theta) \big) \\
&= \sup_\theta \big( -(\dot u + \tilde h) \cdot \theta - F(\theta) \big) = F^*\big(\dot u + \tilde h(\Delta_\cdot u, \cdot)\big).
\end{aligned}
\]

From this we conclude that any solution to (12) is a critical point of the functional

\[ \int_0^T F^*\big(\dot u + \tilde h(\Delta_\cdot u, \cdot)\big)\, ds. \tag{57} \]

This variational problem has to be complemented by suitable boundary conditions. The initial-terminal
value problem corresponds to

\[
\begin{cases}
\theta_0 = -\nabla F^*\big(\dot u(0) + \tilde h(\Delta_\cdot u(0), \cdot)\big), \\
u(T) = \psi\big(\cdot, -\nabla F^*(\dot u(T) + \tilde h(\Delta_\cdot u(T), \cdot))\big).
\end{cases}
\]

Another important boundary condition arises in planning problems. In this case the objective
is to find a terminal cost $u(T)$ which steers an initial probability distribution $\theta_0$ into a terminal
probability distribution $\theta_T$. Hence we have the following

\[
\begin{cases}
\theta_0 = -\nabla F^*\big(\dot u(0) + \tilde h(\Delta_\cdot u(0), \cdot)\big), \\
\theta_T = -\nabla F^*\big(\dot u(T) + \tilde h(\Delta_\cdot u(T), \cdot)\big).
\end{cases}
\]

The variational principle (57) is an analog to the results in [GPSMlTl GSM11 .

5.2 Two PDE's for the value function 

We will present now a PDE for the value function. As pointed out by Lions in his course at the
Collège de France, as well as in [Gue11b, Gue11a], the value function of the mean field game can
be determined by solving a PDE. For this let $g : \mathbb{R}^d \times S^d \times I_d \to \mathbb{R}$ be

\[ g(u, \theta, i) = \sum_j \theta^j \alpha^*_i(\Delta_j u, \theta, j). \]

The first equation of (12) is equivalent to $\frac{d}{dt}\theta^i = g(u, \theta, i)$.
Consider the PDE

\[ -\frac{\partial U^i}{\partial t}(\theta, t) = h(U, \theta, i) + \sum_k g(U, \theta, k)\frac{\partial U^i}{\partial \theta^k}(\theta, t), \tag{58} \]

where $U : I_d \times S^d \times [0, T] \to \mathbb{R}$, and the terminal condition

\[ U^i(\theta, T) = \psi^i(\theta). \tag{59} \]

A direct computation shows us that the following Proposition holds:

Proposition 8. Suppose $U : I_d \times S^d \times [0, T] \to \mathbb{R}$ is a solution of (58) and (59). Let $\theta : [0, T] \to S^d$
and $u : [0, T] \to \mathbb{R}^d$ be two functions such that

1. the first equation of (12) is satisfied, i.e. $\frac{d}{dt}\theta^i = g(u, \theta, i)$;

2. $\theta(0) = \theta_0$;

3. $u^i(t) = U^i(\theta(t), t)$.

Then $u$ satisfies the second equation of (12), i.e. $-\frac{d}{dt}u^i = h(\Delta_i u, \theta, i)$, as well as the terminal
condition $u^i(T) = \psi^i(\theta(T))$. Therefore, $u$ is the value function associated to $\theta$, and so it determines
a Nash equilibrium for the MFG.

As a consequence of the above Proposition, if $U$ is a solution of (58) and (59), the initial value
problem

\[
\begin{cases}
\dfrac{d}{dt}\theta^i = g\big(U(\theta(t), t), \theta(t), i\big), \\[2mm]
\theta(0) = \theta_0,
\end{cases}
\]

can be solved by the usual methods of ODE theory to find a Nash equilibrium $\theta$ and the
associated value function $u^i(t) = U^i(\theta(t), t)$, for any initial distribution $\theta_0$. Also, the function $U$
allows one to calculate the optimal strategies for each player, at any time. In fact the optimal
switching rate of a player in state $i$, given the distribution $\theta$ of players, is $\alpha^*_j(\Delta_i U(\theta, t), \theta, i)$, for
$1 \le j \le d$.

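We should observe that (35) can be regarded as a discretization of (58). Indeed, set $\theta = \frac{n}{N}$
and assume $u^i_n \approx U^i(\theta, t)$, for some smooth function $U$. Then, for large $N$, from (37) we have

\[ \frac{\gamma^{n,i}_{kj}}{N} = \frac{n^k}{N}\alpha^*_j\Big(\Delta_k u_{n+e_{ik}}, \frac{n + e_{ik}}{N}, k\Big) \approx \theta^k \alpha^*_j(\Delta_k U, \theta, k). \]

Furthermore,

\[ \frac{u^i_{n+e_{jk}} - u^i_n}{1/N} = \frac{u^i_{n+e_{jk}} - u^i_{n+e_j}}{1/N} + \frac{u^i_{n+e_j} - u^i_n}{1/N} \approx -\frac{\partial U^i}{\partial \theta^k} + \frac{\partial U^i}{\partial \theta^j}. \]

Therefore

\[
\begin{aligned}
\sum_{k,j} \gamma^{n,i}_{kj}\big(u^i_{n+e_{jk}} - u^i_n\big)
&\approx \sum_{k,j} \theta^k \alpha^*_j(\Delta_k U, \theta, k)\Big(\frac{\partial U^i}{\partial \theta^j} - \frac{\partial U^i}{\partial \theta^k}\Big) \\
&= \sum_{k,j} \theta^k \alpha^*_j(\Delta_k U, \theta, k)\frac{\partial U^i}{\partial \theta^j},
\end{aligned}
\]

taking into account that $\sum_j \alpha^*_j = 0$. Observing that

\[ \sum_{k,j} \theta^k \alpha^*_j(\Delta_k U, \theta, k)\frac{\partial U^i}{\partial \theta^j} = \sum_j g(U, \theta, j)\frac{\partial U^i}{\partial \theta^j} \]

and

\[ h\Big(\Delta_i u_n, \frac{n}{N}, i\Big) = h\Big(u_n, \frac{n}{N}, i\Big) \approx h(U, \theta, i), \]

we conclude, from the above and (35), that

\[ -\frac{\partial U^i}{\partial t}(\theta, t) \approx h(U, \theta, i) + \sum_j g(U, \theta, j)\frac{\partial U^i}{\partial \theta^j}(\theta, t). \]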



For potential mean field games (58) can be further simplified if we suppose that the terminal
condition is given by a gradient

\[ U^i(\theta, T) = \nabla_{\theta^i}\Psi_T(\theta). \tag{61} \]

In this case let $\Psi$ be a solution of the PDE

\[
\begin{cases}
-\dfrac{\partial \Psi}{\partial t} = H(\nabla_\theta \Psi, \theta), \\[2mm]
\Psi(\theta, T) = \Psi_T(\theta).
\end{cases} \tag{62}
\]

Then a direct calculation shows that $U^i(\theta, t) = \nabla_{\theta^i}\Psi(\theta, t)$ is a solution of (58) together with
the terminal condition (61). We should observe that the solutions to Hamilton's equations
(56) are in fact characteristic curves for (62). The Hamilton-Jacobi PDE (62) was explored in
[Gue11b, Gue11a] for the study of MFG problems on graphs.

A Auxiliary Results 

Proof of Proposition 1.

To prove the first item we use the definition of $h$ and $\alpha^*$ and also that $\sum_j \alpha^*_j(z, \theta, i) = 0$ to
get

\[ h(z, \theta, i) + \sum_j \alpha^*_j(z, \theta, i)\, v^j = c\big(i, \theta, \alpha^*(z, \theta, i)\big) + \sum_j \alpha^*_j(z, \theta, i)\big(z^j + v^j - z^i - v^i\big). \]

Hence by the definition of $h(z + v, \theta, i)$ we have

\[ h(z, \theta, i) + \sum_j \alpha^*_j(z, \theta, i)\, v^j \ge h(z + v, \theta, i). \]

From this (7) holds and we deduce that if $h$ is differentiable

\[ \alpha^*_j(\Delta_i z, \theta, i) = \frac{\partial h(\Delta_i z, \theta, i)}{\partial z^j}. \]

Note that item (c) is a direct corollary of item (b), since

\[ h(p, \theta, i) = c\big(i, \theta, \alpha^*(p, \theta, i)\big) + \alpha^*(p, \theta, i) \cdot p \tag{63} \]

and the function $c$ is Lipschitz in $\theta$ and differentiable in $\alpha$.

From this point on in this proof we will omit the index i as it is not relevant and simplifies the 
notation. To prove item (b) we will use the following inequalities, which are consequence of the 
uniform convexity of c: for all 9,9' £ S d , a', a G (Rj) d , J2k a k = J2k a 'k = 0' an d P,p' & K d , we 
have 

c{9, a')+a -p' > c(9, a) + a ■ p' + (V a c(9, a)+p')- {a - a) + j\\a' - a\\ 2 , (64) 

and because a*(p, 9) is a minimizer, 

{V a c{9, a* {p, 9))+p)- {a' ~a*(p,9))>0. (65) 

We will first prove that a* is uniformly Lipschitz in p : for that, we suppose that 9 is fixed. 
By the definition of a* and equation (|64[) we have 

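$$c(\alpha^*(p)) + \alpha^*(p)\cdot p' \ge c(\alpha^*(p')) + \alpha^*(p')\cdot p'$$

$$\ge c(\alpha^*(p)) + \alpha^*(p)\cdot p' + (\nabla_\alpha c(\alpha^*(p)) + p')\cdot(\alpha^*(p') - \alpha^*(p)) + \gamma\|\alpha^*(p') - \alpha^*(p)\|^2,$$

hence

$$0 \ge (\nabla_\alpha c(\alpha^*(p)) + p)\cdot(\alpha^*(p') - \alpha^*(p)) + (p' - p)\cdot(\alpha^*(p') - \alpha^*(p)) + \gamma\|\alpha^*(p') - \alpha^*(p)\|^2.$$

Now using (65) we obtain

$$0 \ge (p' - p)\cdot(\alpha^*(p') - \alpha^*(p)) + \gamma\|\alpha^*(p') - \alpha^*(p)\|^2.$$

Therefore

$$\|p - p'\|\,\|\alpha^*(p') - \alpha^*(p)\| \ge \gamma\|\alpha^*(p') - \alpha^*(p)\|^2,$$

which implies

$$\|\alpha^*(p') - \alpha^*(p)\| \le \frac{1}{\gamma}\|p' - p\|.$$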



This shows that a* is uniformly Lipschitz in p. 
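The Lipschitz bound $\|\alpha^*(p') - \alpha^*(p)\| \le \frac{1}{\gamma}\|p' - p\|$ can be checked numerically on a toy example. The sketch below assumes a hypothetical quadratic cost $c(\alpha) = \gamma\|\alpha\|^2$, which satisfies (64) with constant $\gamma$, and minimizes only over zero-sum vectors (the sign constraints on the rates are dropped); the function names and the sampling ranges are illustrative choices, not part of the paper.

```python
import random

GAMMA = 0.5  # illustrative convexity constant (assumption)

def alpha_star(p, gamma=GAMMA):
    """Minimizer of gamma*|a|^2 + a.p over vectors a with sum(a) == 0."""
    mean_p = sum(p) / len(p)
    return [-(pk - mean_p) / (2.0 * gamma) for pk in p]

def norm(v):
    """Euclidean norm of a list of floats."""
    return sum(x * x for x in v) ** 0.5

def worst_ratio(trials=1000, d=4, seed=0):
    """Largest observed |alpha*(p) - alpha*(q)| / |p - q| over random pairs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        p = [rng.uniform(-5, 5) for _ in range(d)]
        q = [rng.uniform(-5, 5) for _ in range(d)]
        num = norm([a - b for a, b in zip(alpha_star(p), alpha_star(q))])
        den = norm([a - b for a, b in zip(p, q)])
        if den > 0:
            worst = max(worst, num / den)
    return worst

ratio = worst_ratio()
```

For this particular cost the minimizer is linear in $p$, so the observed ratio stays below $1/(2\gamma)$, which is consistent with (and slightly better than) the general bound $1/\gamma$.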

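Now we prove that $\alpha^*$ is Lipschitz in $\theta$; for that, we suppose that $p$ is fixed. Again by the definition of $\alpha^*$ and by equation (64) we have

$$c(\theta',\alpha^*(\theta)) + \alpha^*(\theta)\cdot p \ge c(\theta',\alpha^*(\theta')) + \alpha^*(\theta')\cdot p$$

$$\ge c(\theta',\alpha^*(\theta)) + \alpha^*(\theta)\cdot p + (\nabla_\alpha c(\theta',\alpha^*(\theta)) + p)\cdot(\alpha^*(\theta') - \alpha^*(\theta)) + \gamma\|\alpha^*(\theta') - \alpha^*(\theta)\|^2,$$

and then

$$0 \ge (\nabla_\alpha c(\theta',\alpha^*(\theta)) + p)\cdot(\alpha^*(\theta') - \alpha^*(\theta)) + \gamma\|\alpha^*(\theta') - \alpha^*(\theta)\|^2.$$

Using equation (65) we get

$$0 \ge \left[\nabla_\alpha c(\theta',\alpha^*(\theta)) - \nabla_\alpha c(\theta,\alpha^*(\theta))\right]\cdot(\alpha^*(\theta') - \alpha^*(\theta)) + \gamma\|\alpha^*(\theta') - \alpha^*(\theta)\|^2.$$

As $\nabla_\alpha c(\theta,\alpha)$ is Lipschitz in the variable $\theta$, with constant $K_c$, we have

$$K_c\,\|\theta' - \theta\|\,\|\alpha^*(\theta) - \alpha^*(\theta')\| \ge \gamma\|\alpha^*(\theta') - \alpha^*(\theta)\|^2.$$

Therefore

$$\|\alpha^*(\theta) - \alpha^*(\theta')\| \le \frac{K_c}{\gamma}\|\theta - \theta'\|,$$

which implies that $\alpha^*$ is Lipschitz in $\theta$.

□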



Proof of Theorem 5

The main tool for proving Theorem 5 is once again the Dynkin formula, now adapted to the present situation: suppose $\alpha$ and $\beta$ are two admissible controls.

We recall the infinitesimal generator of the process (i, n) defined in (f5U|) . We have 

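$$(A^{\beta,\alpha}\varphi)^i_n(s) = \sum_j \alpha_j(n,s)\left[\varphi^j_n(s) - \varphi^i_n(s)\right] + \sum_{k,j} \gamma^{n,i}_{\beta,kj}(s)\left[\varphi^i_{n+e_{jk}}(s) - \varphi^i_n(s)\right], \qquad (66)$$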

where 7^ is defined by (fSU|) . 

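We have that, for any function $\varphi : I_d \times \mathcal{S}^d_N \times [0,+\infty) \to \mathbb{R}$, $C^1$ in the last variable, and any $t < T$,

$$E^{\beta,\alpha}_{A_t(i,n)}\left[\varphi^{i_T}_{n_T}(T) - \varphi^i_n(t)\right] = E^{\beta,\alpha}_{A_t(i,n)}\left[\int_t^T \frac{d\varphi^{i_s}_{n_s}}{dt}(s) + (A^{\beta,\alpha}\varphi)^{i_s}_{n_s}(s)\,ds\right], \qquad (67)$$

where $A_t(i,n)$ denotes the event $i_t = i$ and $n_t = n$.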

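Now we prove the theorem. In the Dynkin formula (67) let $\varphi = v$. Using the terminal condition $v^{i_T}_{n_T}(T) = \psi^{i_T}\left(\frac{n_T}{N}\right)$ we have that, for any admissible control $\alpha$,

$$E^{\beta,\alpha}_{A_t(i,n)}\left[\psi^{i_T}\left(\frac{n_T}{N}\right) - v^i_n(t)\right] = E^{\beta,\alpha}_{A_t(i,n)}\left[\int_t^T \frac{dv^{i_s}_{n_s}}{dt}(s) + (A^{\beta,\alpha}v)^{i_s}_{n_s}(s)\,ds\right]. \qquad (68)$$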
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In the next steps we will use the definition of u, given in (f5T]l . and then (1551) . (f5r?| , (U]) to have 



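$$\begin{aligned}
u^i_n(t,\beta,\alpha) &= E^{\beta,\alpha}_{A_t(i,n)}\left[\int_t^T c\left(i_s,\frac{n_s}{N},\alpha_s\right)ds + \psi^{i_T}\left(\frac{n_T}{N}\right)\right] \\
&= v^i_n(t) + E^{\beta,\alpha}_{A_t(i,n)}\left[\int_t^T \frac{dv^{i_s}_{n_s}}{dt}(s) + (A^{\beta,\alpha}v)^{i_s}_{n_s}(s) + c\left(i_s,\frac{n_s}{N},\alpha_s\right)ds\right] \\
&\ge v^i_n(t) + E^{\beta,\alpha}_{A_t(i,n)}\left[\int_t^T \frac{dv^{i_s}_{n_s}}{dt}(s) + \min_{\alpha}\left\{\sum_j \alpha_j\left[v^j_{n_s}(s) - v^{i_s}_{n_s}(s)\right] + c\left(i_s,\frac{n_s}{N},\alpha\right)\right\} + \sum_{k,j}\gamma^{n_s,i_s}_{\beta,kj}(s)\left[v^{i_s}_{n_s+e_{jk}}(s) - v^{i_s}_{n_s}(s)\right]ds\right] \\
&= v^i_n(t),
\end{aligned}$$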



where the last equation holds because $v$ is a solution to the Hamilton-Jacobi equation (32). Note that in this last calculation we are also proving that, for the specific control $\alpha$ given by (34), we have $u^i_n(t,\beta,\alpha) = v^i_n(t)$, which shows that this $\alpha$ is the optimal control and that the objective function $u^i_n(t,\beta)$ is given by $v^i_n(t)$.
Proof of Proposition 6

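Let $u$ be a solution to (32). Let $\tilde u = u + \rho(T - t)$. Then $\tilde u$ satisfies the same equation as $u$ with the constant $\rho$ added to the right-hand side, since $\tilde u$ and $u$ have the same differences in $i$ and $n$, while $-\frac{d\tilde u}{dt} = -\frac{du}{dt} + \rho$.

Let $(i,n,t)$ be a minimum point of $\tilde u$ on $I_d \times \mathcal{S}^d_N \times [0,T]$. We have $\tilde u^i_{n+e_{jk}} \ge \tilde u^i_n$. This implies $\gamma^{n,i}_{kj}\left(\tilde u^i_{n+e_{jk}} - \tilde u^i_n\right) \ge 0$. We also have $\tilde u^j_n(t) - \tilde u^i_n(t) \ge 0$, hence $\Delta_i \tilde u_n = \left(\tilde u^1_n(t) - \tilde u^i_n(t), \ldots, \tilde u^d_n(t) - \tilde u^i_n(t)\right) \ge 0$. Hence

$$-\frac{d\tilde u^i_n}{dt}(t) \ge h\left(\Delta_i \tilde u_n, \frac{n}{N}, i\right) + \rho \ge h\left(0, \frac{n}{N}, i\right) + \rho,$$

by the definition of $h(\Delta_i p,\theta,i)$, with $\Delta_i p \ge 0$. Furthermore, if we take $M < \rho < 2M$ we get

$$-\frac{d\tilde u^i_n}{dt}(t) > 0.$$

This shows that the minimum of $\tilde u$ is achieved at $T$, hence

$$u^i_n(t) \ge -\|u(T)\|_\infty - 2M(T - t).$$

Similarly, let $(i,n,t)$ be a maximum point of $\tilde u$ on $I_d \times \mathcal{S}^d_N \times [0,T]$. We have $\tilde u^i_{n+e_{jk}} \le \tilde u^i_n$. This implies $\gamma^{n,i}_{kj}\left(\tilde u^i_{n+e_{jk}} - \tilde u^i_n\right) \le 0$. We also have $\Delta_i \tilde u_n \le 0$. Hence

$$-\frac{d\tilde u^i_n}{dt}(t) \le h\left(\Delta_i \tilde u_n, \frac{n}{N}, i\right) + \rho \le h\left(0, \frac{n}{N}, i\right) + \rho.$$

Furthermore, if we take $-2M < \rho < -M$ we get

$$-\frac{d\tilde u^i_n}{dt}(t) < 0.$$

This shows that the maximum of $\tilde u$ is achieved at $T$, hence

$$u^i_n(t) \le \|u(T)\|_\infty + 2M(T - t).$$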
Proof of Lemma 3

Recall that $\alpha^*(p,\theta,i)$ is Lipschitz in $(p,\theta)$. Let $K$ be the corresponding Lipschitz constant. Since $\|p\|$ is bounded, we have $|\alpha^*(p,\cdot,\cdot)| \le C$. Then



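$$\begin{aligned}
\left|\gamma^{n+e_{rs},i}_{\beta,kj} - \gamma^{n,i}_{\beta,kj}\right|
&= \left|(n+e_{rs})_k\,\alpha^*_j\left(\Delta_k u_{n+e_{rs}+e_{ik}}, \tfrac{n+e_{rs}+e_{ik}}{N}, k\right) - n_k\,\alpha^*_j\left(\Delta_k u_{n+e_{ik}}, \tfrac{n+e_{ik}}{N}, k\right)\right| \\
&\le \left|(n+e_{rs})_k - n_k\right|\,\left|\alpha^*_j\left(\Delta_k u_{n+e_{rs}+e_{ik}}, \tfrac{n+e_{rs}+e_{ik}}{N}, k\right)\right| \\
&\quad + N\left|\alpha^*_j\left(\Delta_k u_{n+e_{rs}+e_{ik}}, \tfrac{n+e_{rs}+e_{ik}}{N}, k\right) - \alpha^*_j\left(\Delta_k u_{n+e_{ik}}, \tfrac{n+e_{rs}+e_{ik}}{N}, k\right)\right| \\
&\quad + N\left|\alpha^*_j\left(\Delta_k u_{n+e_{ik}}, \tfrac{n+e_{rs}+e_{ik}}{N}, k\right) - \alpha^*_j\left(\Delta_k u_{n+e_{ik}}, \tfrac{n+e_{ik}}{N}, k\right)\right| \\
&\le C + NK\left|\Delta_k\left(u_{n+e_{rs}+e_{ik}} - u_{n+e_{ik}}\right)\right| + NK\,\frac{\|e_{rs}\|}{N} \\
&\le C + 2NK\left\|u_{n+e_{rs}+e_{ik}} - u_{n+e_{ik}}\right\| + C \\
&= C + CN\left\|z^{rs}_{n+e_{ik}}\right\| \le C + CN\max_{rs}\|z^{rs}\|.
\end{aligned}$$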



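□

Proof of Lemma 4

Let $z$ be a solution of (40), and fix $\epsilon > 0$. We define $\tilde z = z + \epsilon(t - T)$. Hence $\tilde z$ satisfies

$$-\frac{d\tilde z_m}{dt} = -\epsilon + \sum_{m'} a_{mm'}(t)\left(\tilde z_{m'} - \tilde z_m\right).$$

Let $(m,t)$ be a maximum point of $\tilde z$ on $I_d \times \mathcal{S}^d_N \times [0,T]$. We have $\tilde z_m(t) \ge \tilde z_{m'}(t)$ and this implies $a_{mm'}(t)\left(\tilde z_{m'} - \tilde z_m\right) \le 0$ for all $m'$. Hence

$$-\frac{d\tilde z_m}{dt}(t) \le -\epsilon.$$

This shows that the maximum of $\tilde z$ is achieved at $T$. Therefore, for all $(m',t)$,

$$z_{m'}(t) + \epsilon(t - T) = \tilde z_{m'}(t) \le \tilde z_m(T) = z_m(T).$$

Letting $\epsilon \to 0$, we get

$$z_{m'}(t) \le \max_m z_m(T), \quad \forall\,(m',t).$$

From this inequality we have the following conclusions:

1. if $z(T) \le 0$, we then have $z_{m'}(t) \le 0$ for all $(m',t)$, and so $z(t) \le 0$;

2. for all $(m',t)$,

$$z_{m'}(t) \le \|z(T)\|_\infty.$$

Now we define $\tilde z = z + \epsilon(T - t)$. Hence $\tilde z$ satisfies

$$-\frac{d\tilde z_m}{dt} = \epsilon + \sum_{m' \in I_d \times \mathcal{S}^d_N} a_{mm'}(t)\left(\tilde z_{m'} - \tilde z_m\right).$$

Let $(m,t)$ be a minimum point of $\tilde z$ on $I_d \times \mathcal{S}^d_N \times [0,T]$. We have $a_{mm'}(t)\left(\tilde z_{m'} - \tilde z_m\right) \ge 0$. Therefore we have

$$-\frac{d\tilde z_m}{dt}(t) \ge \epsilon.$$

This shows that the minimum of $\tilde z$ is also achieved at $T$, hence for all $(m',t)$ we have

$$z_{m'}(t) + \epsilon(T - t) = \tilde z_{m'}(t) \ge \tilde z_m(T) = z_m(T).$$

Letting $\epsilon \to 0$, we get $z_{m'}(t) \ge \min_m z_m(T)$. Hence

$$z_{m'}(t) \ge -\|z(T)\|_\infty,$$

and therefore we have $\|z(t)\|_\infty \le \|z(T)\|_\infty$.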
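The maximum principle just proved can be illustrated numerically. The sketch below integrates $-\dot z_m = \sum_{m'} a_{mm'}(z_{m'} - z_m)$ backward from $T$ with explicit Euler steps; the rates are taken nonnegative and constant in time, the state space is a small hypothetical index set, and all names are illustrative assumptions. When the step times each row sum of the rates is at most 1, every step is a convex combination of the current values, so the sup norm cannot grow and nonpositive terminal data stays nonpositive.

```python
import random

def step_back(z, a, dt):
    """One explicit Euler step backward in time for -dz_m/dt = sum_k a[m][k]*(z_k - z_m)."""
    n = len(z)
    return [z[m] + dt * sum(a[m][k] * (z[k] - z[m]) for k in range(n)) for m in range(n)]

def solve_backward(z_T, a, T=1.0, steps=2000):
    """Return the list of states from time T down to time 0."""
    dt = T / steps
    path = [list(z_T)]
    for _ in range(steps):
        path.append(step_back(path[-1], a, dt))
    return path

def sup_norm(z):
    """Sup norm of a list of floats."""
    return max(abs(x) for x in z)

rng = random.Random(2)
n = 5
# Nonnegative off-diagonal rates, constant in time (simplifying assumption).
a = [[0.0 if m == k else rng.uniform(0.0, 3.0) for k in range(n)] for m in range(n)]
z_T = [rng.uniform(-1.0, 1.0) for _ in range(n)]

path = solve_backward(z_T, a)
path_nonpositive = solve_backward([-abs(x) for x in z_T], a)
```

Here `path` collects the discrete approximations of $z(t)$ for $t$ decreasing from $T$ to $0$, and `path_nonpositive` does the same for terminal data $z(T) \le 0$, matching the two conclusions of the lemma.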



Proof of Lemma 6

We note that if $t < s < T$ we have $K(t,s)K(s,T) = K(t,T)$, which implies

$$\frac{d}{ds}\left[K(t,s)K(s,T)\right] = 0.$$

Hence, using equation (42) we get

$$-K(t,s)M(s)K(s,T) + \left(\frac{d}{ds}K(t,s)\right)K(s,T) = 0,$$

and therefore, by taking $T \to s$, we conclude that

$$\frac{d}{ds}K(t,s) = K(t,s)M(s). \qquad (69)$$

Multiplying (45) by $K(t,s)$ and using Lemma 5 we have

$$-K(t,s)\dot z(s) \le K(t,s)M(s)z(s) + K(t,s)f(z(s)).$$

Using the identity

$$\frac{d}{ds}\left[K(t,s)z(s)\right] = K(t,s)\dot z(s) + K(t,s)M(s)z(s),$$

which follows from (69), we get

$$-\frac{d}{ds}\left(K(t,s)z(s)\right) \le K(t,s)f(z(s)).$$

Thus, integrating between $t$ and $T$, we have

$$z(t) - K(t,T)z(T) \le \int_t^T K(t,s)f(z(s))\,ds.$$

Note that $z(t) = K(t,T)z(T)$ is the solution of (40) with terminal data $z(T)$, so Lemma 4 implies that $\|z(t)\|_\infty \le \|z(T)\|_\infty$, hence $\|K(t,T)z(T)\|_\infty \le \|z(T)\|_\infty$. Therefore for all $m \in I_d \times \mathcal{S}^d_N$ we have

$$z_m(t) \le \|z(T)\|_\infty + \int_t^T \|f(z(s))\|_\infty\,ds.$$

□



Proof of Lemma 7

Note that (44) implies that $v$ is a monotone decreasing function of $s$ and is equivalent to



-l 



'da _ 

dv Cv+CNv 2 + §- 



This implies by direct integration that 

'2C X 



W ]<T 



dv 



Cv + CNv 2 + % 



Now 



dv 



Cv + CNv 2 + ft 



> 



N 



zdv = 



1 



2C+4CTT' we nave *^ at s (n 



2C 2 + 4C* 3 + C 2C + AC 2 + 1 ' 

< if T < T*. Hence this implies 



Therefore if we define T* 

v(0) < ^, which yields the desired result when we take into account that v is a decreasing 
function of s. 